Posted to dev@kafka.apache.org by William Henry <wh...@redhat.com> on 2012/10/01 21:07:40 UTC

Re: Integration with AMQP

Hi,

Has anyone looked at this email?  Anyone care to express an opinion?

It seems like Apache has ActiveMQ and Qpid, which are already working on integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just as ActiveMQ is trying to integrate with Qpid/Proton.

If folks are interested I'd be willing to take a look at the integration and help out.

Best regards,
William

----- Original Message -----
> Hi,
> 
> 
> Has anyone looked at integrating kafka with Apache Qpid to get AMQP
> support?
> 
> 
> Best,
> William

Re: Integration with AMQP

Posted by Jun Rao <ju...@gmail.com>.
William,

We haven't really thought about this. What's the benefit of AMQP and how
widely is it being adopted?

Thanks,

Jun

On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com> wrote:

> Hi,
>
> Has anyone looked at this email?  Anyone care to express an opinion?
>
> It seems like Apache has ActiveMQ and Qpid, which are already working on
> integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just
> as ActiveMQ is trying to integrate with Qpid/Proton.
>
> If folks are interested I'd be willing to take a look at the integration
> and help out.
>
> Best regards,
> William
>
> ----- Original Message -----
> > Hi,
> >
> >
> > Has anyone looked at integrating kafka with Apache Qpid to get AMQP
> > support?
> >
> >
> > Best,
> > William
>

Re: Integration with AMQP

Posted by William Henry <wh...@redhat.com>.
+1. I can help you in the same timeframe.  

William


On Oct 2, 2012, at 8:32 PM, Jay Kreps <ja...@gmail.com> wrote:

> I think a first step would be to do a detailed comparison of Kafka
> semantics and the AMQP model and see how good the fit is. If no one else is
> game I would be willing to do this, though I probably wouldn't start for
> about a month. I think this would be more meaningful, since unless we know
> the explicit drawbacks it is hard to think concretely about the pros
> and cons.
> 
> -Jay
> 
> On Tue, Oct 2, 2012 at 5:27 PM, William Henry <wh...@redhat.com> wrote:
> 
>> 
>> 
>> ----- Original Message -----
>>> I'm not exactly sure about why talking the same at the wire level is
>>> an
>>> explicit goal; if raw blazing speed is also equally important.
>>> 
>>> Removing the wireformat from Kafka could be done through abstraction;
>>> but
>>> that would incur reinterpretation  costs(talking native Kafka) and
>>> take a
>>> performance hit.
>>> 
>>> I could make a similar argument over the second goal as well. It is
>>> not
>>> apparent that solving ALL problems through abstraction and then
>>> universally
>>> accepting a performance hit is that ideal. It may make
>>> purchasing/acquiring
>>> easier; but by adhering to the lowest common denominator.
>>> 
>>> Wouldn't it be better to make explicit tradeoffs for one-offs based
>>> on
>>> specific needs? i.e. if your architecture doesn't require ZooKeeper in
>>> Kafka
>>> for coordination, reduce the complexity. Don't force complexity in
>>> all
>>> cases.
>> 
>> Sure, but if AMQP meets all the goals then why go with a so-called
>> best-of-breed "proprietary" protocol?
>> 
>> Who said lowest common denominator? If AMQP is the lowest common
>> denominator then it has failed.
>> 
>> Not sure how this is complexity either. Surely reusing an existing tried
>> and tested protocol instead of developing and maintaining a new one is less
>> complex.
>> 
>>> 
>>> In other words, Kafka is different enough from other messaging
>>> systems that
>>> to enforce a common contract (aka amqp) ,without incurring a
>>> significant
>>> performance hit, would be very challenging.
>> 
>> There are no metrics to say this would be a performance hit at all.
>> 
>> Again I'm not saying it is the right fit but I don't think we can conclude
>> that without investigating. And I haven't seen anything that would suggest
>> it can't fit.
>> 
>> William
>> 
>> 
>>> On Oct 2, 2012 3:22 PM, "William Henry" <wh...@redhat.com> wrote:
>>> 
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>>> I looked into AMQP when I was first starting Kafka work. I see
>>>>> the
>>>>> crux of
>>>>> the issue as this: if you have a bunch of systems that
>>>>> essentially
>>>>> expose
>>>>> the same functionality there is value in standardizing the
>>>>> protocol
>>>>> by
>>>>> which they are accessed to help decouple interface from
>>>>> implementation. Of
>>>>> course I think it is better still to end up with a single good
>>>>> implementation (e.g. Linux rather than Posix). But invariably the
>>>>> protocol
>>>>> dictates the feature set, which dictates the implementation, and
>>>>> so
>>>>> this
>>>>> only really works if the systems have the same feature set and
>>>>> similar
>>>>> enough implementations. This becomes true in a domain over time
>>>>> as
>>>>> people
>>>>> learn the best way to build that kind of system, and all the
>>>>> systems
>>>>> converge to that.
>>>> 
>>>> +1
>>>> 
>>>>> 
>>>>> The reason we have not been pursuing this is that I think the set
>>>>> of
>>>>> functionality we are aiming for is a little different than what
>>>>> most
>>>>> message brokers have. Basically the idea we have is to attempt to
>>>>> re-imagine "messaging" or asynchronous processing infrastructure
>>>>> as a
>>>>> distributed, replicated, partitioned "commit log". This is
>>>>> different
>>>>> enough
>>>>> from what other systems do that attempting to support a
>>>>> standardized
>>>>> protocol is unlikely to work out well. For example, the consumer
>>>>> balancing
>>>>> we do is not modeled in AMQP, and there are many AMQP features
>>>>> that
>>>>> Kafka
>>>>> doesn't have.
>>>> 
>>>> I need to understand your consumer balancing a bit more but AMQP is
>>>> designed not to be another MOM like traditional broker based
>>>> messaging
>>>> systems, though it does support that model.
>>>> 
>>>> I like to explain the goals of AMQP to be threefold (some may argue
>>>> differently):
>>>> 
>>>> 1) A Standard wire protocol for interoperability.  i.e. have all
>>>> messaging
>>>> systems speak the same on the wire.
>>>> 2) Handle all messaging use cases well - i.e. not just asynch, not
>>>> just
>>>> fanout, not just pub/sub but instead do it all so that AMQP is
>>>> applicable
>>>> to all use cases. Let's not have a "we do AMQP everywhere except X
>>>> because it doesn't do X very well."
>>>> 3) Must be fast. Even if it does 1 and 2 very well, it will not be
>>>> adopted by a wide range of applications unless it is also fast.
>>>> 
>>>> So if by consumer balancing you mean multiple consumers feeding off
>>>> a
>>>> particular address/source/publisher/producer etc. then AMQP does
>>>> manage
>>>> that model.
>>>> 
>>>> 
>>>>> 
>>>>> Basically I don't really see other messaging systems as being
>>>>> fully
>>>>> formed
>>>>> distributed systems that act as a *cluster* (rather than an
>>>>> ensemble
>>>>> of
>>>>> brokers).
>>>> 
>>>> This is exactly what we in the Qpid community are working towards
>>>> right
>>>> now.  I think AMQP as a protocol under Kafka and exploiting Kafka's
>>>> framework is a great idea.
>>>> 
>>>> Please look at the new Qpid/Proton work and some of Ted Ross's
>>>> (cc-ed)
>>>> router work.
>>>> 
>>>>> Conceptually when people program to, say, HDFS, you largely
>>>>> forget that under the covers it is a collection of data nodes and
>>>>> you
>>>>> think
>>>>> about it as a single entity. There are a number of points in the
>>>>> design
>>>>> that make this possible (and a number of areas where HDFS falls
>>>>> short). I
>>>>> think there is a lot to be gained by bringing to bear this modern
>>>>> style of
>>>>> distributed systems design in this space. Needless to say people
>>>>> who
>>>>> work
>>>>> on these other systems totally disagree with this assessment, so
>>>>> it
>>>>> is a
>>>>> bit of an experiment.
>>>> 
>>>> This is very interesting to me and some of the customers (at least
>>>> 2) I
>>>> work with.
>>>> 
>>>>> 
>>>>> I think an interesting analogy is to databases. Relational
>>>>> databases
>>>>> took
>>>>> this path to some extent. They started out with a very diverse
>>>>> feature set,
>>>>> and eventually converged to a fairly standard set of
>>>>> functionality
>>>>> with
>>>>> reasonable compatibility protocols (ODBC, JDBC). Distributed
>>>>> databases,
>>>>> though, are much more constrained and virtually always fail when
>>>>> they
>>>>> attempt to be compatible with centralized RDBMS's because they
>>>>> just
>>>>> can't
>>>>> do all the same stuff (but can do other things). I think as the
>>>>> distributed
>>>>> database space settles down it will become clear how to provide
>>>>> some
>>>>> kind
>>>>> of general protocol to standardize access, but trying to do that
>>>>> too
>>>>> soon
>>>>> wouldn't really help.
>>>>> 
>>>>> Another option, instead of making Kafka an AMQP system, would be to
>>>>> try to make Kafka a multi-protocol system that supported many
>>>>> protocols natively, sharing basic socket infrastructure. I have been
>>>>> down this path and it is a very hard road. I would not like to do
>>>>> that again.
>>>> 
>>>> I understand that.
>>>> 
>>>>> 
>>>>> That said it would be very interesting to see how well AMQP could
>>>>> be
>>>>> mapped
>>>>> to Kafka semantics, and there is nothing that prevents this
>>>>> experiment from
>>>>> happening outside the main codebase. It is totally possible to
>>>>> just
>>>>> call
>>>>> new KafkaServer(), access all the business logic from there, and
>>>>> wrap
>>>>> that
>>>>> in AMQP, REST, or any other protocol. That might be a good way to
>>>>> conduct
>>>>> the experiment if anyone is interested in trying it.
>>>> 
>>>> I would love to take a look at this. Any pointer on where an
>>>> integration
>>>> point might be would be welcome.  There is so much work in the AMQP
>>>> and
>>>> Qpid communities that Kafka could benefit from. You could
>>>> concentrate on
>>>> the "cluster" model and let Qpid/Proton handle the payload
>>>> distribution on
>>>> the wire.
>>>> 
>>>> I'm willing to take the risk that I might be wrong but right now I
>>>> don't
>>>> see where AMQP would fall down in this case.
>>>> 
>>>> Best regards,
>>>> William
>>>> 
>>>>> Cheers,
>>>>> 
>>>>> -Jay
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Oct 1, 2012 at 12:07 PM, William Henry
>>>>> <wh...@redhat.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Has anyone looked at this email?  Anyone care to express an
>>>>>> opinion?
>>>>>> 
>>>>>> It seems like Apache has ActiveMQ and Qpid, which are already
>>>>>> working on
>>>>>> integrating, and now Kafka. Kafka might benefit by using
>>>>>> Qpid/Proton just
>>>>>> as ActiveMQ is trying to integrate with Qpid/Proton.
>>>>>> 
>>>>>> If folks are interested I'd be willing to take a look at the
>>>>>> integration
>>>>>> and help out.
>>>>>> 
>>>>>> Best regards,
>>>>>> William
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>>> Hi,
>>>>>>> 
>>>>>>> 
>>>>>>> Has anyone looked at integrating kafka with Apache Qpid to
>>>>>>> get
>>>>>>> AMQP
>>>>>>> support?
>>>>>>> 
>>>>>>> 
>>>>>>> Best,
>>>>>>> William
>> 

Re: Integration with AMQP

Posted by Jay Kreps <ja...@gmail.com>.
I think a first step would be to do a detailed comparison of Kafka
semantics and the AMQP model and see how good the fit is. If no one else is
game I would be willing to do this, though I probably wouldn't start for
about a month. I think this would be more meaningful, since unless we know
the explicit drawbacks it is hard to think concretely about the pros
and cons.

-Jay
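
One way to make that comparison concrete is a toy model of the two delivery semantics. The sketch below is illustrative only: `CommitLog` and `WorkQueue` are hypothetical names, not Kafka or Qpid classes, and the queue stands in for a traditional broker with competing consumers rather than for the AMQP specification itself. It shows the difference the thread keeps returning to: a log retains records and lets each reader track its own offset, while a queue hands each message to a single consumer and then drops it.

```python
from collections import deque


class CommitLog:
    """Hypothetical append-only log: records are retained, readers own their offsets."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records=10):
        # Reading never removes anything; the caller decides where to resume.
        return self._records[offset:offset + max_records]


class WorkQueue:
    """Hypothetical broker-style queue: a message is gone once it is received."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        self._pending.append(message)

    def receive_and_ack(self):
        # Returns None when the queue is empty.
        return self._pending.popleft() if self._pending else None


log = CommitLog()
queue = WorkQueue()
for event in ["a", "b", "c"]:
    log.append(event)
    queue.publish(event)

# Two independent log readers each see the full history.
reader_one = log.read(0)
reader_two = log.read(0)

# Two queue consumers compete: each message goes to exactly one of them.
consumer_one = [queue.receive_and_ack(), queue.receive_and_ack()]
consumer_two = [queue.receive_and_ack(), queue.receive_and_ack()]
```

Both log readers see all three records, whereas the two queue consumers split them between themselves. How a standard protocol would express reader-owned offsets is the kind of question such a comparison would need to answer.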

On Tue, Oct 2, 2012 at 5:27 PM, William Henry <wh...@redhat.com> wrote:

>
>
> ----- Original Message -----
> > I'm not exactly sure about why talking the same at the wire level is
> > an
> > explicit goal; if raw blazing speed is also equally important.
> >
> > Removing the wireformat from Kafka could be done through abstraction;
> > but
> > that would incur reinterpretation  costs(talking native Kafka) and
> > take a
> > performance hit.
> >
> > I could make a similar argument over the second goal as well. It is
> > not
> > apparent that solving ALL problems through abstraction and then
> > universally
> > accepting a performance hit is that ideal. It may make
> > purchasing/acquiring
> > easier; but by adhering to the lowest common denominator.
> >
> > Wouldn't it be better to make explicit tradeoffs for one-offs based
> > on
> > specific needs? i.e. if your architecture doesn't require ZooKeeper in
> > Kafka
> > for coordination, reduce the complexity. Don't force complexity in
> > all
> > cases.
>
> Sure, but if AMQP meets all the goals then why go with a so-called
> best-of-breed "proprietary" protocol?
>
> Who said lowest common denominator? If AMQP is the lowest common
> denominator then it has failed.
>
> Not sure how this is complexity either. Surely reusing an existing tried
> and tested protocol instead of developing and maintaining a new one is less
> complex.
>
> >
> > In other words, Kafka is different enough from other messaging
> > systems that
> > to enforce a common contract (aka amqp) ,without incurring a
> > significant
> > performance hit, would be very challenging.
>
> There are no metrics to say this would be a performance hit at all.
>
> Again I'm not saying it is the right fit but I don't think we can conclude
> that without investigating. And I haven't seen anything that would suggest
> it can't fit.
>
> William
>
>
> >  On Oct 2, 2012 3:22 PM, "William Henry" <wh...@redhat.com> wrote:
> >
> > >
> > >
> > > ----- Original Message -----
> > > > I looked into AMQP when I was first starting Kafka work. I see
> > > > the
> > > > crux of
> > > > the issue as this: if you have a bunch of systems that
> > > > essentially
> > > > expose
> > > > the same functionality there is value in standardizing the
> > > > protocol
> > > > by
> > > > which they are accessed to help decouple interface from
> > > > implementation. Of
> > > > course I think it is better still to end up with a single good
> > > > implementation (e.g. Linux rather than Posix). But invariably the
> > > > protocol
> > > > dictates the feature set, which dictates the implementation, and
> > > > so
> > > > this
> > > > only really works if the systems have the same feature set and
> > > > similar
> > > > enough implementations. This becomes true in a domain over time
> > > > as
> > > > people
> > > > learn the best way to build that kind of system, and all the
> > > > systems
> > > > converge to that.
> > >
> > > +1
> > >
> > > >
> > > > The reason we have not been pursuing this is that I think the set
> > > > of
> > > > functionality we are aiming for is a little different than what
> > > > most
> > > > message brokers have. Basically the idea we have is to attempt to
> > > > re-imagine "messaging" or asynchronous processing infrastructure
> > > > as a
> > > > distributed, replicated, partitioned "commit log". This is
> > > > different
> > > > enough
> > > > from what other systems do that attempting to support a
> > > > standardized
> > > > protocol is unlikely to work out well. For example, the consumer
> > > > balancing
> > > > we do is not modeled in AMQP, and there are many AMQP features
> > > > that
> > > > Kafka
> > > > doesn't have.
> > >
> > > I need to understand your consumer balancing a bit more but AMQP is
> > > designed not to be another MOM like traditional broker based
> > > messaging
> > > systems, though it does support that model.
> > >
> > > I like to explain the goals of AMQP to be threefold (some may argue
> > > differently):
> > >
> > > 1) A Standard wire protocol for interoperability.  i.e. have all
> > > messaging
> > > systems speak the same on the wire.
> > > 2) Handle all messaging use cases well - i.e. not just asynch, not
> > > just
> > > fanout, not just pub/sub but instead do it all so that AMQP is
> > > applicable
> > > to all use cases. Let's not have a "we do AMQP everywhere except X
> > > because it doesn't do X very well."
> > > 3) Must be fast. Even if it does 1 and 2 very well, it will not be
> > > adopted by a wide range of applications unless it is also fast.
> > >
> > > So if by consumer balancing you mean multiple consumers feeding off
> > > a
> > > particular address/source/publisher/producer etc. then AMQP does
> > > manage
> > > that model.
> > >
> > >
> > > >
> > > > Basically I don't really see other messaging systems as being
> > > > fully
> > > > formed
> > > > distributed systems that act as a *cluster* (rather than an
> > > > ensemble
> > > > of
> > > > brokers).
> > >
> > > This is exactly what we in the Qpid community are working towards
> > > right
> > > now.  I think AMQP as a protocol under Kafka and exploiting Kafka's
> > > framework is a great idea.
> > >
> > > Please look at the new Qpid/Proton work and some of Ted Ross's
> > > (cc-ed)
> > > router work.
> > >
> > > > Conceptually when people program to, say, HDFS, you largely
> > > > forget that under the covers it is a collection of data nodes and
> > > > you
> > > > think
> > > > about it as a single entity. There are a number of points in the
> > > > design
> > > > that make this possible (and a number of areas where HDFS falls
> > > > short). I
> > > > think there is a lot to be gained by bringing to bear this modern
> > > > style of
> > > > distributed systems design in this space. Needless to say people
> > > > who
> > > > work
> > > > on these other systems totally disagree with this assessment, so
> > > > it
> > > > is a
> > > > bit of an experiment.
> > >
> > > This is very interesting to me and some of the customers (at least
> > > 2) I
> > > work with.
> > >
> > > >
> > > > I think an interesting analogy is to databases. Relational
> > > > databases
> > > > took
> > > > this path to some extent. They started out with a very diverse
> > > > feature set,
> > > > and eventually converged to a fairly standard set of
> > > > functionality
> > > > with
> > > > reasonable compatibility protocols (ODBC, JDBC). Distributed
> > > > databases,
> > > > though, are much more constrained and virtually always fail when
> > > > they
> > > > attempt to be compatible with centralized RDBMS's because they
> > > > just
> > > > can't
> > > > do all the same stuff (but can do other things). I think as the
> > > > distributed
> > > > database space settles down it will become clear how to provide
> > > > some
> > > > kind
> > > > of general protocol to standardize access, but trying to do that
> > > > too
> > > > soon
> > > > wouldn't really help.
> > > >
> > > > Another option, instead of making Kafka an AMQP system, would be to
> > > > try to make Kafka a multi-protocol system that supported many
> > > > protocols natively, sharing basic socket infrastructure. I have been
> > > > down this path and it is a very hard road. I would not like to do
> > > > that again.
> > >
> > > I understand that.
> > >
> > > >
> > > > That said it would be very interesting to see how well AMQP could
> > > > be
> > > > mapped
> > > > to Kafka semantics, and there is nothing that prevents this
> > > > experiment from
> > > > happening outside the main codebase. It is totally possible to
> > > > just
> > > > call
> > > > new KafkaServer(), access all the business logic from there, and
> > > > wrap
> > > > that
> > > > in AMQP, REST, or any other protocol. That might be a good way to
> > > > conduct
> > > > the experiment if anyone is interested in trying it.
> > > >
> > >
> > > I would love to take a look at this. Any pointer on where an
> > > integration
> > > point might be would be welcome.  There is so much work in the AMQP
> > > and
> > > Qpid communities that Kafka could benefit from. You could
> > > concentrate on
> > > the "cluster" model and let Qpid/Proton handle the payload
> > > distribution on
> > > the wire.
> > >
> > > I'm willing to take the risk that I might be wrong but right now I
> > > don't
> > > see where AMQP would fall down in this case.
> > >
> > > Best regards,
> > > William
> > >
> > > > Cheers,
> > > >
> > > > -Jay
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Oct 1, 2012 at 12:07 PM, William Henry
> > > > <wh...@redhat.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Has anyone looked at this email?  Anyone care to express an
> > > > > opinion?
> > > > >
> > > > > It seems like Apache has ActiveMQ and Qpid, which are already
> > > > > working on
> > > > > integrating, and now Kafka. Kafka might benefit by using
> > > > > Qpid/Proton just
> > > > > as ActiveMQ is trying to integrate with Qpid/Proton.
> > > > >
> > > > > If folks are interested I'd be willing to take a look at the
> > > > > integration
> > > > > and help out.
> > > > >
> > > > > Best regards,
> > > > > William
> > > > >
> > > > > ----- Original Message -----
> > > > > > Hi,
> > > > > >
> > > > > >
> > > > > > Has anyone looked at integrating kafka with Apache Qpid to
> > > > > > get
> > > > > > AMQP
> > > > > > support?
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > William
> > > > >
> > > >
> > >
> >
>

Re: Integration with AMQP

Posted by William Henry <wh...@redhat.com>.

----- Original Message -----
> I'm not exactly sure about why talking the same at the wire level is
> an
> explicit goal; if raw blazing speed is also equally important.
> 
> Removing the wireformat from Kafka could be done through abstraction;
> but
> that would incur reinterpretation  costs(talking native Kafka) and
> take a
> performance hit.
> 
> I could make a similar argument over the second goal as well. It is
> not
> apparent that solving ALL problems through abstraction and then
> universally
> accepting a performance hit is that ideal. It may make
> purchasing/acquiring
> easier; but by adhering to the lowest common denominator.
> 
> Wouldn't it be better to make explicit tradeoffs for one-offs based
> on
> specific needs? i.e. if your architecture doesn't require ZooKeeper in
> Kafka
> for coordination, reduce the complexity. Don't force complexity in
> all
> cases.

Sure, but if AMQP meets all the goals then why go with a so-called best-of-breed "proprietary" protocol?

Who said lowest common denominator? If AMQP is the lowest common denominator then it has failed.

Not sure how this is complexity either. Surely reusing an existing tried and tested protocol instead of developing and maintaining a new one is less complex.

> 
> In other words, Kafka is different enough from other messaging
> systems that
> to enforce a common contract (aka amqp) ,without incurring a
> significant
> performance hit, would be very challenging.

There are no metrics to say this would be a performance hit at all.

Again I'm not saying it is the right fit but I don't think we can conclude that without investigating. And I haven't seen anything that would suggest it can't fit.

William


>  On Oct 2, 2012 3:22 PM, "William Henry" <wh...@redhat.com> wrote:
> 
> >
> >
> > ----- Original Message -----
> > > I looked into AMQP when I was first starting Kafka work. I see
> > > the
> > > crux of
> > > the issue as this: if you have a bunch of systems that
> > > essentially
> > > expose
> > > the same functionality there is value in standardizing the
> > > protocol
> > > by
> > > which they are accessed to help decouple interface from
> > > implementation. Of
> > > course I think it is better still to end up with a single good
> > > implementation (e.g. Linux rather than Posix). But invariably the
> > > protocol
> > > dictates the feature set, which dictates the implementation, and
> > > so
> > > this
> > > only really works if the systems have the same feature set and
> > > similar
> > > enough implementations. This becomes true in a domain over time
> > > as
> > > people
> > > learn the best way to build that kind of system, and all the
> > > systems
> > > converge to that.
> >
> > +1
> >
> > >
> > > The reason we have not been pursuing this is that I think the set
> > > of
> > > functionality we are aiming for is a little different than what
> > > most
> > > message brokers have. Basically the idea we have is to attempt to
> > > re-imagine "messaging" or asynchronous processing infrastructure
> > > as a
> > > distributed, replicated, partitioned "commit log". This is
> > > different
> > > enough
> > > from what other systems do that attempting to support a
> > > standardized
> > > protocol is unlikely to work out well. For example, the consumer
> > > balancing
> > > we do is not modeled in AMQP, and there are many AMQP features
> > > that
> > > Kafka
> > > doesn't have.
> >
> > I need to understand your consumer balancing a bit more but AMQP is
> > designed not to be another MOM like traditional broker based
> > messaging
> > systems, though it does support that model.
> >
> > I like to explain the goals of AMQP to be threefold (some may argue
> > differently):
> >
> > 1) A Standard wire protocol for interoperability.  i.e. have all
> > messaging
> > systems speak the same on the wire.
> > 2) Handle all messaging use cases well - i.e. not just asynch, not
> > just
> > fanout, not just pub/sub but instead do it all so that AMQP is
> > applicable
> > to all use cases. Let's not have a "we do AMQP everywhere except X
> > because it doesn't do X very well."
> > 3) Must be fast. Even if it does 1 and 2 very well, it will not be
> > adopted by a wide range of applications unless it is also fast.
> >
> > So if by consumer balancing you mean multiple consumers feeding off
> > a
> > particular address/source/publisher/producer etc. then AMQP does
> > manage
> > that model.
> >
> >
> > >
> > > Basically I don't really see other messaging systems as being
> > > fully
> > > formed
> > > distributed systems that act as a *cluster* (rather than an
> > > ensemble
> > > of
> > > brokers).
> >
> > This is exactly what we in the Qpid community are working towards
> > right
> > now.  I think AMQP as a protocol under Kafka and exploiting Kafka's
> > framework is a great idea.
> >
> > Please look at the new Qpid/Proton work and some of Ted Ross's
> > (cc-ed)
> > router work.
> >
> > > Conceptually when people program to, say, HDFS, you largely
> > > forget that under the covers it is a collection of data nodes and
> > > you
> > > think
> > > about it as a single entity. There are a number of points in the
> > > design
> > > that make this possible (and a number of areas where HDFS falls
> > > short). I
> > > think there is a lot to be gained by bringing to bear this modern
> > > style of
> > > distributed systems design in this space. Needless to say people
> > > who
> > > work
> > > on these other systems totally disagree with this assessment, so
> > > it
> > > is a
> > > bit of an experiment.
> >
> > This is very interesting to me and some of the customers (at least
> > 2) I
> > work with.
> >
> > >
> > > I think an interesting analogy is to databases. Relational
> > > databases
> > > took
> > > this path to some extent. They started out with a very diverse
> > > feature set,
> > > and eventually converged to a fairly standard set of
> > > functionality
> > > with
> > > reasonable compatibility protocols (ODBC, JDBC). Distributed
> > > databases,
> > > though, are much more constrained and virtually always fail when
> > > they
> > > attempt to be compatible with centralized RDBMS's because they
> > > just
> > > can't
> > > do all the same stuff (but can do other things). I think as the
> > > distributed
> > > database space settles down it will become clear how to provide
> > > some
> > > kind
> > > of general protocol to standardize access, but trying to do that
> > > too
> > > soon
> > > wouldn't really help.
> > >
> > > Another option, instead of making Kafka an AMQP system, would be to
> > > try to make Kafka a multi-protocol system that supported many
> > > protocols natively, sharing basic socket infrastructure. I have been
> > > down this path and it is a very hard road. I would not like to do
> > > that again.
> >
> > I understand that.
> >
> > >
> > > That said it would be very interesting to see how well AMQP could
> > > be
> > > mapped
> > > to Kafka semantics, and there is nothing that prevents this
> > > experiment from
> > > happening outside the main codebase. It is totally possible to
> > > just
> > > call
> > > new KafkaServer(), access all the business logic from there, and
> > > wrap
> > > that
> > > in AMQP, REST, or any other protocol. That might be a good way to
> > > conduct
> > > the experiment if anyone is interested in trying it.
> > >
> >
> > I would love to take a look at this. Any pointer on where an
> > integration
> > point might be would be welcome.  There is so much work in the AMQP
> > and
> > Qpid communities that Kafka could benefit from. You could
> > concentrate on
> > the "cluster" model and let Qpid/Proton handle the payload
> > distribution on
> > the wire.
> >
> > I'm willing to take the risk that I might be wrong but right now I
> > don't
> > see where AMQP would fall down in this case.
> >
> > Best regards,
> > William
> >
> > > Cheers,
> > >
> > > -Jay
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Oct 1, 2012 at 12:07 PM, William Henry
> > > <wh...@redhat.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Has anyone looked at this email?  Anyone care to express an
> > > > opinion?
> > > >
> > > > It seems like Apache has ActiveMQ and Qpid, which are already
> > > > working on
> > > > integrating, and now Kafka. Kafka might benefit by using
> > > > Qpid/Proton just
> > > > as ActiveMQ is trying to integrate with Qpid/Proton.
> > > >
> > > > If folks are interested I'd be willing to take a look at the
> > > > integration
> > > > and help out.
> > > >
> > > > Best regards,
> > > > William
> > > >
> > > > ----- Original Message -----
> > > > > Hi,
> > > > >
> > > > >
> > > > > Has anyone looked at integrating kafka with Apache Qpid to
> > > > > get
> > > > > AMQP
> > > > > support?
> > > > >
> > > > >
> > > > > Best,
> > > > > William
> > > >
> > >
> >
> 

Re: Integration with AMQP

Posted by Milind Parikh <mi...@gmail.com>.
I'm not exactly sure why talking the same protocol at the wire level is an
explicit goal if raw blazing speed is equally important.

Removing the wire format from Kafka could be done through abstraction, but
that would incur reinterpretation costs (talking native Kafka) and take a
performance hit.

I could make a similar argument about the second goal as well. It is not
apparent that solving ALL problems through abstraction and then universally
accepting a performance hit is ideal. It may make purchasing/acquiring
easier, but only by adhering to the lowest common denominator.

Wouldn't it be better to make explicit tradeoffs for one-offs based on
specific needs? For example, if your architecture doesn't require ZooKeeper
in Kafka for coordination, reduce the complexity. Don't force complexity in
all cases.

In other words, Kafka is different enough from other messaging systems that
enforcing a common contract (aka AMQP) without incurring a significant
performance hit would be very challenging.
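
The reinterpretation cost mentioned above can be pictured with a small adapter. This is a sketch under stated assumptions, not Kafka's or AMQP's actual format: `InProcessLog`, `EnvelopeAdapter`, the dict-shaped envelope, and the length-prefixed record layout are all made up for illustration. It only shows where the extra decode and re-encode step sits when a foreign message shape is mapped onto a native log record.

```python
import json
import struct


class InProcessLog:
    """Hypothetical stand-in for an embedded broker's append/read logic."""

    def __init__(self):
        self._records = []

    def append(self, payload: bytes) -> int:
        self._records.append(payload)
        return len(self._records) - 1  # offset of the stored record

    def read(self, offset: int) -> bytes:
        return self._records[offset]


class EnvelopeAdapter:
    """Hypothetical front end: accepts a generic envelope, stores native records."""

    def __init__(self, log: InProcessLog):
        self._log = log

    def publish(self, envelope: dict) -> int:
        # Reinterpretation step: decode the foreign envelope and re-encode it
        # as a made-up native record (4-byte header length, headers, body).
        headers = json.dumps(envelope.get("headers", {}), sort_keys=True).encode("utf-8")
        body = envelope["body"].encode("utf-8")
        record = struct.pack(">I", len(headers)) + headers + body
        return self._log.append(record)

    def fetch(self, offset: int) -> dict:
        # The same translation runs in reverse on the way out.
        record = self._log.read(offset)
        (header_len,) = struct.unpack(">I", record[:4])
        headers = json.loads(record[4:4 + header_len].decode("utf-8"))
        body = record[4 + header_len:].decode("utf-8")
        return {"headers": headers, "body": body}


adapter = EnvelopeAdapter(InProcessLog())
offset = adapter.publish({"headers": {"topic": "clicks"}, "body": "page=/home"})
round_trip = adapter.fetch(offset)
```

Whether that extra step costs anything measurable would have to be benchmarked; the sketch does not settle it.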
 On Oct 2, 2012 3:22 PM, "William Henry" <wh...@redhat.com> wrote:

>
>
> ----- Original Message -----
> > I looked into AMQP when I was first starting Kafka work. I see the
> > crux of
> > the issue as this: if you have a bunch of systems that essentially
> > expose
> > the same functionality there is value in standardizing the protocol
> > by
> > which they are accessed to help decouple interface from
> > implementation. Of
> > course I think it is better still to end up with a single good
> > implementation (e.g. Linux rather than Posix). But invariably the
> > protocol
> > dictates the feature set, which dictates the implementation, and so
> > this
> > only really works if the systems have the same feature set and
> > similar
> > enough implementations. This becomes true in a domain over time as
> > people
> > learn the best way to build that kind of system, and all the systems
> > converge to that.
>
> +1
>
> >
> > The reason we have not been pursuing this is that I think the set of
> > functionality we are aiming for is a little different than what most
> > message brokers have. Basically the idea we have is to attempt to
> > re-imagine "messaging" or asynchronous processing infrastructure as a
> > distributed, replicated, partitioned "commit log". This is different
> > enough
> > from what other systems do that attempting to support a standardized
> > protocol is unlikely to work out well. For example, the consumer
> > balancing
> > we do is not modeled in AMQP, and there are many AMQP features that
> > Kafka
> > doesn't have.
>
> I need to understand your consumer balancing a bit more but AMQP is
> designed not to be another MOM like traditional broker based messaging
> systems, though it does support that model.
>
> I like to explain the goals of AMQP to be threefold (some may argue
> differently):
>
> 1) A standard wire protocol for interoperability, i.e. have all messaging
> systems speak the same language on the wire.
> 2) Handle all messaging use cases well, i.e. not just async, not just
> fanout, not just pub/sub, but all of them, so that AMQP is applicable
> to every use case. Let's not have a "we do AMQP everywhere except X because
> it doesn't do X very well."
> 3) Must be fast. Even if it does 1 and 2 very well, it will not be adopted
> by a wide range of applications unless it is also fast.
>
> So if by consumer balancing you mean multiple consumers feeding off a
> particular address/source/publisher/producer etc. then AMQP does manage
> that model.
>
>
> >
> > Basically I don't really see other messaging systems as being fully
> > formed
> > distributed systems that act as a *cluster* (rather than an ensemble
> > of
> > brokers).
>
> This is exactly what we in the Qpid community are working towards right
> now.  I think AMQP as a protocol under Kafka and exploiting Kafka's
> framework is a great idea.
>
> Please look at the new Qpid/Proton work and some of Ted Ross's (cc-ed)
> router work.
>
> > Conceptually when people program to, say, HDFS, you largely
> > forget that under the covers it is a collection of data nodes and you
> > think
> > about it as a single entity. There are a number of points in the
> > design
> > that make this possible (and a number of areas where HDFS falls
> > short). I
> > think there is a lot to be gained by bringing to bear this modern
> > style of
> > distributed systems design in this space. Needless to say people who
> > work
> > on these other systems totally disagree with this assessment, so it
> > is a
> > bit of an experiment.
>
> This is very interesting to me and some of the customers (at least 2) I
> work with.
>
> >
> > I think an interesting analogy is to databases. Relational databases
> > took
> > this path to some extent. They started out with a very diverse
> > feature set,
> > and eventually converged to a fairly standard set of functionality
> > with
> > reasonable compatibility protocols (ODBC, JDBC). Distributed
> > databases,
> > though, are much more constrained and virtually always fail when they
> > attempt to be compatible with centralized RDBMS's because they just
> > can't
> > do all the same stuff (but can do other things). I think as the
> > distributed
> > database space settles down it will become clear how to provide some
> > kind
> > of general protocol to standardize access, but trying to do that too
> > soon
> > wouldn't really help.
> >
> > Another option, instead of making Kafka an AMQP system, would be to
> > try to
> > make Kafka a multi-protocol system that supported many protocols
> > natively,
> > sharing basic socket infrastructure. I have been down this path and
> > it is a
> > very hard road. I would not like to do that again.
>
> I understand that.
>
> >
> > That said it would be very interesting to see how well AMQP could be
> > mapped
> > to Kafka semantics, and there is nothing that prevents this
> > experiment from
> > happening outside the main codebase. It is totally possible to just
> > call
> > new KafkaServer(), access all the business logic from there, and wrap
> > that
> > in AMQP, REST, or any other protocol. That might be a good way to
> > conduct
> > the experiment if anyone is interested in trying it.
> >
>
> I would love to take a look at this. Any pointer on where an integration
> point might be would be welcome.  There is so much work in the AMQP and
> Qpid communities that Kafka could benefit from. You could concentrate on
> the "cluster" model and let Qpid/Proton handle the payload distribution on
> the wire.
>
> I'm willing to take the risk that I might be wrong but right now I don't
> see where AMQP would fall down in this case.
>
> Best regards,
> William
>
> > Cheers,
> >
> > -Jay
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Has anyone looked at this email?  Anyone care to express an
> > > opinion?
> > >
> > > It seems like Apache has ActiveMQ and Qpid, which are already
> > > working on
> > > integrating, and now Kafka. Kafka might benefit by using
> > > Qpid/Proton just
> > > as ActiveMQ is trying to integrate with Qpid/Proton.
> > >
> > > If folks are interested I'd be willing to take a look at the
> > > integration
> > > and help out.
> > >
> > > Best regards,
> > > William
> > >
> > > ----- Original Message -----
> > > > Hi,
> > > >
> > > >
> > > > Has anyone looked at integrating kafka with Apache Qpid to get
> > > > AMQP
> > > > support?
> > > >
> > > >
> > > > Best,
> > > > William
> > >
> >
>

Re: Integration with AMQP

Posted by William Henry <wh...@redhat.com>.
Also, FYI: I plan to be at ApacheCon EU and would love to talk to Kafka folks there, perhaps over a pint.

:)

William

----- Original Message -----
> 
> 
> ----- Original Message -----
> > I'd like to understand the use case more - why would you want to do
> > this
> > exactly (use AMQP + Kafka) other than just because?
> 
> Good point.
> 
> See my previous email. It's not a just because. I think it is
> precisely the kind of synergy that projects like Qpid/Proton are
> working towards. We'd love to see AMQP 1.0 be ubiquitous. However we
> understand that at a higher level there are frameworks and
> architectures that have specific value to users.
> 
> I've recently been working on integrating Qpid/Proton with OpenMAMA
> as an example. And I'm working on interoperability with Microsoft's
> and other middleware products.
> 
> Again, the goal of AMQP was to handle all messaging use cases and if
> it can't handle Kafka's workload then AMQP needs to address that.
> 
> 
> So I could see Kafka using Proton, which provides AMQP 1.0.
> 
> Best,
> William
> 
> 
> > 
> > On Tue, Oct 2, 2012 at 9:22 AM, Jay Kreps <ja...@gmail.com>
> > wrote:
> > 
> > > I looked into AMQP when I was first starting Kafka work. I see
> > > the
> > > crux of
> > > the issue as this: if you have a bunch of systems that
> > > essentially
> > > expose
> > > the same functionality there is value in standardizing the
> > > protocol
> > > by
> > > which they are accessed to help decouple interface from
> > > implementation. Of
> > > course I think it is better still to end up with a single good
> > > implementation (e.g. Linux rather than Posix). But invariably the
> > > protocol
> > > dictates the feature set, which dictates the implementation, and
> > > so
> > > this
> > > only really works if the systems have the same feature set and
> > > similar
> > > enough implementations. This becomes true in a domain over time
> > > as
> > > people
> > > learn the best way to build that kind of system, and all the
> > > systems
> > > converge to that.
> > >
> > > The reason we have not been pursuing this is that I think the set
> > > of
> > > functionality we are aiming for is a little different than what
> > > most
> > > message brokers have. Basically the idea we have is to attempt to
> > > re-imagine "messaging" or asynchronous processing infrastructure
> > > as
> > > a
> > > distributed, replicated, partitioned "commit log". This is
> > > different enough
> > > from what other systems do that attempting to support a
> > > standardized
> > > protocol is unlikely to work out well. For example, the consumer
> > > balancing
> > > we do is not modeled in AMQP, and there are many AMQP features
> > > that
> > > Kafka
> > > doesn't have.
> > >
> > > Basically I don't really see other messaging systems as being
> > > fully
> > > formed
> > > distributed systems that act as a *cluster* (rather than an
> > > ensemble of
> > > brokers). Conceptually when people program to, say, HDFS, you
> > > largely
> > > forget that under the covers it is a collection of data nodes and
> > > you think
> > > about it as a single entity. There are a number of points in the
> > > design
> > > that make this possible (and a number of areas where HDFS falls
> > > short). I
> > > think there is a lot to be gained by bringing to bear this modern
> > > style of
> > > distributed systems design in this space. Needless to say people
> > > who work
> > > on these other systems totally disagree with this assessment, so
> > > it
> > > is a
> > > bit of an experiment.
> > >
> > > I think an interesting analogy is to databases. Relational
> > > databases took
> > > this path to some extent. They started out with a very diverse
> > > feature set,
> > > and eventually converged to a fairly standard set of
> > > functionality
> > > with
> > > reasonable compatibility protocols (ODBC, JDBC). Distributed
> > > databases,
> > > though, are much more constrained and virtually always fail when
> > > they
> > > attempt to be compatible with centralized RDBMS's because they
> > > just
> > > can't
> > > do all the same stuff (but can do other things). I think as the
> > > distributed
> > > database space settles down it will become clear how to provide
> > > some kind
> > > of general protocol to standardize access, but trying to do that
> > > too soon
> > > wouldn't really help.
> > >
> > > Another option, instead of making Kafka an AMQP system, would be
> > > to
> > > try to
> > > make Kafka a multi-protocol system that supported many protocols
> > > natively,
> > > sharing basic socket infrastructure. I have been down this path
> > > and
> > > it is a
> > > very hard road. I would not like to do that again.
> > >
> > > That said it would be very interesting to see how well AMQP could
> > > be mapped
> > > to Kafka semantics, and there is nothing that prevents this
> > > experiment from
> > > happening outside the main codebase. It is totally possible to
> > > just
> > > call
> > > new KafkaServer(), access all the business logic from there, and
> > > wrap that
> > > in AMQP, REST, or any other protocol. That might be a good way to
> > > conduct
> > > the experiment if anyone is interested in trying it.
> > >
> > > Cheers,
> > >
> > > -Jay
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Oct 1, 2012 at 12:07 PM, William Henry
> > > <wh...@redhat.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Has anyone looked at this email?  Anyone care to express an
> > > > opinion?
> > > >
> > > > It seems like Apache has ActiveMQ and Qpid, which are already
> > > > working on
> > > > integrating, and now Kafka. Kafka might benefit by using
> > > > Qpid/Proton just
> > > > as ActiveMQ is trying to integrate with Qpid/Proton.
> > > >
> > > > If folks are interested I'd be willing to take a look at the
> > > > integration
> > > > and help out.
> > > >
> > > > Best regards,
> > > > William
> > > >
> > > > ----- Original Message -----
> > > > > Hi,
> > > > >
> > > > >
> > > > > Has anyone looked at integrating kafka with Apache Qpid to
> > > > > get
> > > > > AMQP
> > > > > support?
> > > > >
> > > > >
> > > > > Best,
> > > > > William
> > > >
> > >
> > 
> 

Re: Integration with AMQP

Posted by William Henry <wh...@redhat.com>.

----- Original Message -----
> I'd like to understand the use case more - why would you want to do
> this
> exactly (use AMQP + Kafka) other than just because?

Good point.

See my previous email. It's not a just because. I think it is precisely the kind of synergy that projects like Qpid/Proton are working towards. We'd love to see AMQP 1.0 be ubiquitous. However we understand that at a higher level there are frameworks and architectures that have specific value to users.

I've recently been working on integrating Qpid/Proton with OpenMAMA as an example. And I'm working on interoperability with Microsoft's and other middleware products.

Again, the goal of AMQP was to handle all messaging use cases and if it can't handle Kafka's workload then AMQP needs to address that. 


So I could see Kafka using Proton, which provides AMQP 1.0.

Best,
William


> 
> On Tue, Oct 2, 2012 at 9:22 AM, Jay Kreps <ja...@gmail.com>
> wrote:
> 
> > I looked into AMQP when I was first starting Kafka work. I see the
> > crux of
> > the issue as this: if you have a bunch of systems that essentially
> > expose
> > the same functionality there is value in standardizing the protocol
> > by
> > which they are accessed to help decouple interface from
> > implementation. Of
> > course I think it is better still to end up with a single good
> > implementation (e.g. Linux rather than Posix). But invariably the
> > protocol
> > dictates the feature set, which dictates the implementation, and so
> > this
> > only really works if the systems have the same feature set and
> > similar
> > enough implementations. This becomes true in a domain over time as
> > people
> > learn the best way to build that kind of system, and all the
> > systems
> > converge to that.
> >
> > The reason we have not been pursuing this is that I think the set
> > of
> > functionality we are aiming for is a little different than what
> > most
> > message brokers have. Basically the idea we have is to attempt to
> > re-imagine "messaging" or asynchronous processing infrastructure as
> > a
> > distributed, replicated, partitioned "commit log". This is
> > different enough
> > from what other systems do that attempting to support a standardized
> > protocol is unlikely to work out well. For example, the consumer
> > balancing
> > we do is not modeled in AMQP, and there are many AMQP features that
> > Kafka
> > doesn't have.
> >
> > Basically I don't really see other messaging systems as being fully
> > formed
> > distributed systems that act as a *cluster* (rather than an
> > ensemble of
> > brokers). Conceptually when people program to, say, HDFS, you
> > largely
> > forget that under the covers it is a collection of data nodes and
> > you think
> > about it as a single entity. There are a number of points in the
> > design
> > that make this possible (and a number of areas where HDFS falls
> > short). I
> > think there is a lot to be gained by bringing to bear this modern
> > style of
> > distributed systems design in this space. Needless to say people
> > who work
> > on these other systems totally disagree with this assessment, so it
> > is a
> > bit of an experiment.
> >
> > I think an interesting analogy is to databases. Relational
> > databases took
> > this path to some extent. They started out with a very diverse
> > feature set,
> > and eventually converged to a fairly standard set of functionality
> > with
> > reasonable compatibility protocols (ODBC, JDBC). Distributed
> > databases,
> > though, are much more constrained and virtually always fail when
> > they
> > attempt to be compatible with centralized RDBMS's because they just
> > can't
> > do all the same stuff (but can do other things). I think as the
> > distributed
> > database space settles down it will become clear how to provide
> > some kind
> > of general protocol to standardize access, but trying to do that
> > too soon
> > wouldn't really help.
> >
> > Another option, instead of making Kafka an AMQP system, would be to
> > try to
> > make Kafka a multi-protocol system that supported many protocols
> > natively,
> > sharing basic socket infrastructure. I have been down this path and
> > it is a
> > very hard road. I would not like to do that again.
> >
> > That said it would be very interesting to see how well AMQP could
> > be mapped
> > to Kafka semantics, and there is nothing that prevents this
> > experiment from
> > happening outside the main codebase. It is totally possible to just
> > call
> > new KafkaServer(), access all the business logic from there, and
> > wrap that
> > in AMQP, REST, or any other protocol. That might be a good way to
> > conduct
> > the experiment if anyone is interested in trying it.
> >
> > Cheers,
> >
> > -Jay
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Has anyone looked at this email?  Anyone care to express an
> > > opinion?
> > >
> > > It seems like Apache has ActiveMQ and Qpid, which are already
> > > working on
> > > integrating, and now Kafka. Kafka might benefit by using
> > > Qpid/Proton just
> > > as ActiveMQ is trying to integrate with Qpid/Proton.
> > >
> > > If folks are interested I'd be willing to take a look at the
> > > integration
> > > and help out.
> > >
> > > Best regards,
> > > William
> > >
> > > ----- Original Message -----
> > > > Hi,
> > > >
> > > >
> > > > Has anyone looked at integrating kafka with Apache Qpid to get
> > > > AMQP
> > > > support?
> > > >
> > > >
> > > > Best,
> > > > William
> > >
> >
> 

Re: Integration with AMQP

Posted by Taylor Gautier <tg...@gmail.com>.
I'd like to understand the use case more - why would you want to do this
exactly (use AMQP + Kafka) other than just because?

On Tue, Oct 2, 2012 at 9:22 AM, Jay Kreps <ja...@gmail.com> wrote:

> I looked into AMQP when I was first starting Kafka work. I see the crux of
> the issue as this: if you have a bunch of systems that essentially expose
> the same functionality there is value in standardizing the protocol by
> which they are accessed to help decouple interface from implementation. Of
> course I think it is better still to end up with a single good
> implementation (e.g. Linux rather than Posix). But invariably the protocol
> dictates the feature set, which dictates the implementation, and so this
> only really works if the systems have the same feature set and similar
> enough implementations. This becomes true in a domain over time as people
> learn the best way to build that kind of system, and all the systems
> converge to that.
>
> The reason we have not been pursuing this is that I think the set of
> functionality we are aiming for is a little different than what most
> message brokers have. Basically the idea we have is to attempt to
> re-imagine "messaging" or asynchronous processing infrastructure as a
> distributed, replicated, partitioned "commit log". This is different enough
> from what other systems do that attempting to support a standardized
> protocol is unlikely to work out well. For example, the consumer balancing
> we do is not modeled in AMQP, and there are many AMQP features that Kafka
> doesn't have.
>
> Basically I don't really see other messaging systems as being fully formed
> distributed systems that act as a *cluster* (rather than an ensemble of
> brokers). Conceptually when people program to, say, HDFS, you largely
> forget that under the covers it is a collection of data nodes and you think
> about it as a single entity. There are a number of points in the design
> that make this possible (and a number of areas where HDFS falls short). I
> think there is a lot to be gained by bringing to bear this modern style of
> distributed systems design in this space. Needless to say people who work
> on these other systems totally disagree with this assessment, so it is a
> bit of an experiment.
>
> I think an interesting analogy is to databases. Relational databases took
> this path to some extent. They started out with a very diverse feature set,
> and eventually converged to a fairly standard set of functionality with
> reasonable compatibility protocols (ODBC, JDBC). Distributed databases,
> though, are much more constrained and virtually always fail when they
> attempt to be compatible with centralized RDBMS's because they just can't
> do all the same stuff (but can do other things). I think as the distributed
> database space settles down it will become clear how to provide some kind
> of general protocol to standardize access, but trying to do that too soon
> wouldn't really help.
>
> Another option, instead of making Kafka an AMQP system, would be to try to
> make Kafka a multi-protocol system that supported many protocols natively,
> sharing basic socket infrastructure. I have been down this path and it is a
> very hard road. I would not like to do that again.
>
> That said it would be very interesting to see how well AMQP could be mapped
> to Kafka semantics, and there is nothing that prevents this experiment from
> happening outside the main codebase. It is totally possible to just call
> new KafkaServer(), access all the business logic from there, and wrap that
> in AMQP, REST, or any other protocol. That might be a good way to conduct
> the experiment if anyone is interested in trying it.
>
> Cheers,
>
> -Jay
>
>
>
>
>
>
>
> On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com> wrote:
>
> > Hi,
> >
> > Has anyone looked at this email?  Anyone care to express an opinion?
> >
> > It seems like Apache has ActiveMQ and Qpid, which are already working on
> > integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just
> > as ActiveMQ is trying to integrate with Qpid/Proton.
> >
> > If folks are interested I'd be willing to take a look at the integration
> > and help out.
> >
> > Best regards,
> > William
> >
> > ----- Original Message -----
> > > Hi,
> > >
> > >
> > > Has anyone looked at integrating kafka with Apache Qpid to get AMQP
> > > support?
> > >
> > >
> > > Best,
> > > William
> >
>

Re: Integration with AMQP

Posted by William Henry <wh...@redhat.com>.

----- Original Message -----
> I looked into AMQP when I was first starting Kafka work. I see the
> crux of
> the issue as this: if you have a bunch of systems that essentially
> expose
> the same functionality there is value in standardizing the protocol
> by
> which they are accessed to help decouple interface from
> implementation. Of
> course I think it is better still to end up with a single good
> implementation (e.g. Linux rather than Posix). But invariably the
> protocol
> dictates the feature set, which dictates the implementation, and so
> this
> only really works if the systems have the same feature set and
> similar
> enough implementations. This becomes true in a domain over time as
> people
> learn the best way to build that kind of system, and all the systems
> converge to that.

+1 

> 
> The reason we have not been pursuing this is that I think the set of
> functionality we are aiming for is a little different than what most
> message brokers have. Basically the idea we have is to attempt to
> re-imagine "messaging" or asynchronous processing infrastructure as a
> distributed, replicated, partitioned "commit log". This is different
> enough
> from what other systems do that attempting to support a standardized
> protocol is unlikely to work out well. For example, the consumer
> balancing
> we do is not modeled in AMQP, and there are many AMQP features that
> Kafka
> doesn't have.

I need to understand your consumer balancing a bit more, but AMQP is designed not to be another MOM like traditional broker-based messaging systems, though it does support that model.

I like to explain the goals of AMQP to be threefold (some may argue differently):

1) A standard wire protocol for interoperability, i.e. have all messaging systems speak the same language on the wire.
2) Handle all messaging use cases well, i.e. not just async, not just fanout, not just pub/sub, but all of them, so that AMQP is applicable to every use case. Let's not have a "we do AMQP everywhere except X because it doesn't do X very well."
3) Must be fast. Even if it does 1 and 2 very well, it will not be adopted by a wide range of applications unless it is also fast.

So if by consumer balancing you mean multiple consumers feeding off a particular address/source/publisher/producer etc. then AMQP does manage that model.
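
For comparison, here is a minimal sketch of partition-based consumer balancing, where each partition is owned by exactly one consumer in a group and ownership is recomputed when the group changes. This is an illustration under assumptions, not Kafka's actual rebalancing code: the function name `assign_partitions` and the range-style split are hypothetical choices made for the example.

```python
def assign_partitions(partitions, consumers):
    """Split partitions into contiguous blocks, one block per consumer.

    Hypothetical helper for illustration; not taken from Kafka.
    """
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    assignment = {c: [] for c in consumers}
    if not consumers:
        return assignment
    # Every consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(partitions), len(consumers))
    start = 0
    for i, consumer in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# Two consumers share four partitions; a third joins and ownership shifts.
before = assign_partitions(range(4), ["c1", "c2"])
after = assign_partitions(range(4), ["c1", "c2", "c3"])
```

The difference from several consumers feeding off one address is that no two group members ever read the same partition, so per-partition ordering is kept without the broker tracking individual deliveries.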


> 
> Basically I don't really see other messaging systems as being fully
> formed
> distributed systems that act as a *cluster* (rather than an ensemble
> of
> brokers). 

This is exactly what we in the Qpid community are working towards right now.  I think AMQP as a protocol under Kafka and exploiting Kafka's framework is a great idea.

Please look at the new Qpid/Proton work and some of Ted Ross's (cc-ed) router work.

> Conceptually when people program to, say, HDFS, you largely
> forget that under the covers it is a collection of data nodes and you
> think
> about it as a single entity. There are a number of points in the
> design
> that make this possible (and a number of areas where HDFS falls
> short). I
> think there is a lot to be gained by bringing to bear this modern
> style of
> distributed systems design in this space. Needless to say people who
> work
> on these other systems totally disagree with this assessment, so it
> is a
> bit of an experiment.

This is very interesting to me and some of the customers (at least 2) I work with.

> 
> I think an interesting analogy is to databases. Relational databases
> took
> this path to some extent. They started out with a very diverse
> feature set,
> and eventually converged to a fairly standard set of functionality
> with
> reasonable compatibility protocols (ODBC, JDBC). Distributed
> databases,
> though, are much more constrained and virtually always fail when they
> attempt to be compatible with centralized RDBMS's because they just
> can't
> do all the same stuff (but can do other things). I think as the
> distributed
> database space settles down it will become clear how to provide some
> kind
> of general protocol to standardize access, but trying to do that too
> soon
> wouldn't really help.
> 
> Another option, instead of making Kafka an AMQP system, would be to
> try to
> make Kafka a multi-protocol system that supported many protocols
> natively,
> sharing basic socket infrastructure. I have been down this path and
> it is a
> very hard road. I would not like to do that again.

I understand that. 

> 
> That said it would be very interesting to see how well AMQP could be
> mapped
> to Kafka semantics, and there is nothing that prevents this
> experiment from
> happening outside the main codebase. It is totally possible to just
> call
> new KafkaServer(), access all the business logic from there, and wrap
> that
> in AMQP, REST, or any other protocol. That might be a good way to
> conduct
> the experiment if anyone is interested in trying it.
> 

I would love to take a look at this. Any pointers on where an integration point might be would be welcome.  There is so much work in the AMQP and Qpid communities that Kafka could benefit from. You could concentrate on the "cluster" model and let Qpid/Proton handle the payload distribution on the wire.

I'm willing to take the risk that I might be wrong but right now I don't see where AMQP would fall down in this case.

Best regards,
William

> Cheers,
> 
> -Jay
> 
> 
> 
> 
> 
> 
> 
> On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com>
> wrote:
> 
> > Hi,
> >
> > Has anyone looked at this email?  Anyone care to express an
> > opinion?
> >
> > It seems like Apache has ActiveMQ and Qpid, which are already
> > working on
> > integrating, and now Kafka. Kafka might benefit by using
> > Qpid/Proton just
> > as ActiveMQ is trying to integrate with Qpid/Proton.
> >
> > If folks are interested I'd be willing to take a look at the
> > integration
> > and help out.
> >
> > Best regards,
> > William
> >
> > ----- Original Message -----
> > > Hi,
> > >
> > >
> > > Has anyone looked at integrating kafka with Apache Qpid to get
> > > AMQP
> > > support?
> > >
> > >
> > > Best,
> > > William
> >
> 

Re: Integration with AMQP

Posted by Jay Kreps <ja...@gmail.com>.
I looked into AMQP when I was first starting Kafka work. I see the crux of
the issue as this: if you have a bunch of systems that essentially expose
the same functionality there is value in standardizing the protocol by
which they are accessed to help decouple interface from implementation. Of
course I think it is better still to end up with a single good
implementation (e.g. Linux rather than Posix). But invariably the protocol
dictates the feature set, which dictates the implementation, and so this
only really works if the systems have the same feature set and similar
enough implementations. This becomes true in a domain over time as people
learn the best way to build that kind of system, and all the systems
converge to that.

The reason we have not been pursuing this is that I think the set of
functionality we are aiming for is a little different than what most
message brokers have. Basically the idea we have is to attempt to
re-imagine "messaging" or asynchronous processing infrastructure as a
distributed, replicated, partitioned "commit log". This is different enough
from what other systems do that attempting to support a standardized
protocol is unlikely to work out well. For example, the consumer balancing
we do is not modeled in AMQP, and there are many AMQP features that Kafka
doesn't have.
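
To make the "commit log" idea concrete, here is a minimal sketch, assuming a single process and no replication. The class `PartitionedLog` and its methods are hypothetical names for illustration only, not Kafka's API. The point it tries to show is that reads never remove messages and each consumer tracks its own offset.

```python
import zlib


class PartitionedLog:
    """Toy append-only log split into partitions (illustrative only)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # A stable hash keeps every message for a key in one partition,
        # which preserves per-key ordering.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        p = self.partition_for(key)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_messages=10):
        # Reading is non-destructive; the caller owns its position.
        return self.partitions[partition][offset:offset + max_messages]


log = PartitionedLog(num_partitions=2)
p, first = log.append("user-1", "login")
_, second = log.append("user-1", "click")

# Two independent consumers keep their own offsets into the same partition.
offsets = {"billing": first, "audit": first}
billing_batch = log.read(p, offsets["billing"])
offsets["billing"] += len(billing_batch)
audit_batch = log.read(p, offsets["audit"])  # unaffected by billing's progress
replay = log.read(p, first)                  # rewinding is just re-reading
```

In a traditional queue, delivering or acknowledging a message usually changes broker-side state for that message; here the only per-consumer state is an integer offset.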

Basically I don't really see other messaging systems as being fully formed
distributed systems that act as a *cluster* (rather than an ensemble of
brokers). Conceptually when people program to, say, HDFS, you largely
forget that under the covers it is a collection of data nodes and you think
about it as a single entity. There are a number of points in the design
that make this possible (and a number of areas where HDFS falls short). I
think there is a lot to be gained by bringing to bear this modern style of
distributed systems design in this space. Needless to say people who work
on these other systems totally disagree with this assessment, so it is a
bit of an experiment.

I think an interesting analogy is to databases. Relational databases took
this path to some extent. They started out with a very diverse feature set,
and eventually converged to a fairly standard set of functionality with
reasonable compatibility protocols (ODBC, JDBC). Distributed databases,
though, are much more constrained and virtually always fail when they
attempt to be compatible with centralized RDBMS's because they just can't
do all the same stuff (but can do other things). I think as the distributed
database space settles down it will become clear how to provide some kind
of general protocol to standardize access, but trying to do that too soon
wouldn't really help.

Another option, instead of making Kafka an AMQP system, would be to try to
make Kafka a multi-protocol system that supported many protocols natively,
sharing basic socket infrastructure. I have been down this path and it is a
very hard road. I would not like to do that again.

That said it would be very interesting to see how well AMQP could be mapped
to Kafka semantics, and there is nothing that prevents this experiment from
happening outside the main codebase. It is totally possible to just call
new KafkaServer(), access all the business logic from there, and wrap that
in AMQP, REST, or any other protocol. That might be a good way to conduct
the experiment if anyone is interested in trying it.
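
A rough sketch of what such an out-of-tree experiment could look like is below. Everything in it is an assumption made for illustration: `EmbeddedBroker` is a stand-in for the in-process server logic and does not reflect the real KafkaServer interface, and the two front ends only borrow vocabulary (AMQP 1.0 addresses, links, and transfers; REST paths) without implementing either protocol.

```python
class EmbeddedBroker:
    """Stand-in for in-process broker logic (hypothetical, not Kafka's API)."""

    def __init__(self):
        self.topics = {}

    def produce(self, topic, message):
        log = self.topics.setdefault(topic, [])
        log.append(message)
        return len(log) - 1  # offset of the appended message

    def fetch(self, topic, offset):
        return self.topics.get(topic, [])[offset:]


class AmqpStyleFrontend:
    """Maps AMQP-flavoured terms (address, link, transfer) onto the broker."""

    def __init__(self, broker):
        self.broker = broker
        self.link_positions = {}  # (link name, address) -> next offset

    def transfer(self, address, body):
        # An address is treated as a topic name in this sketch.
        return self.broker.produce(address, body)

    def receive(self, link_name, address):
        key = (link_name, address)
        offset = self.link_positions.get(key, 0)
        messages = self.broker.fetch(address, offset)
        self.link_positions[key] = offset + len(messages)
        return messages


class RestStyleFrontend:
    """POST /topics/<name> appends; GET /topics/<name>/<offset> reads."""

    def __init__(self, broker):
        self.broker = broker

    def handle(self, method, path, body=None):
        parts = path.strip("/").split("/")
        if method == "POST" and len(parts) == 2 and parts[0] == "topics":
            return {"offset": self.broker.produce(parts[1], body)}
        if method == "GET" and len(parts) == 3 and parts[0] == "topics":
            return {"messages": self.broker.fetch(parts[1], int(parts[2]))}
        return {"error": "unsupported"}


# Both front ends share one broker instance, so they see the same log.
broker = EmbeddedBroker()
amqp = AmqpStyleFrontend(broker)
rest = RestStyleFrontend(broker)
amqp.transfer("events", "a")
rest.handle("POST", "/topics/events", "b")
first = amqp.receive("link-1", "events")
second = amqp.receive("link-1", "events")
via_rest = rest.handle("GET", "/topics/events/1")
```

The mapping is deliberately lossy: anything the wrapped log does not model would have to be emulated in the front end or rejected, which is where the semantic comparison would start.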

Cheers,

-Jay







On Mon, Oct 1, 2012 at 12:07 PM, William Henry <wh...@redhat.com> wrote:

> Hi,
>
> Has anyone looked at this email?  Anyone care to express an opinion?
>
> It seems like Apache has ActiveMQ and Qpid, which are already working on
> integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just
> as ActiveMQ is trying to integrate with Qpid/Proton.
>
> If folks are interested I'd be willing to take a look at the integration
> and help out.
>
> Best regards,
> William
>
> ----- Original Message -----
> > Hi,
> >
> >
> > Has anyone looked at integrating kafka with Apache Qpid to get AMQP
> > support?
> >
> >
> > Best,
> > William
>