Posted to users@kafka.apache.org by Doug Tomm <dc...@gmail.com> on 2016/01/09 01:32:24 UTC

best python library to use?

we're using kafka-python, weighing pykafka, and wondering if there's 
another that is better to use.  does confluent endorse or recommend a
particular python package (psorry for the alliteration)?

doug


Re: best python library to use?

Posted by Andrew Otto <ot...@wikimedia.org>.
I’m not the maintainer of either pykafka or librdkafka, so I can’t
really comment much on the benefit, but you may be right.  However,
librdkafka is well maintained and solid, so using it as the backing for a
Python client gets you the benefit of not having to reinvent features
yourself in Python.

New kafka-python stuff looks cool, I’m excited to try it out for producing
soon.

On Mon, Jan 11, 2016 at 12:37 PM, Dana Powers <da...@gmail.com> wrote:

> Agree - kafka-python was in hibernation waiting for 0.9.0.0 Kafka release,
> so a few issues lingered longer than I would have liked. Most of my
> comments relate to latest master, which we are hoping to release after a
> bit more testing and polish.
>
> Re librdkafka -- to be honest, I'm skeptical that C protocol bindings are
> going to improve python performance much. In my experience, the devil is in
> the client logic details, not the wire protocol parsing. Adding a C
> compilation step also adds installation and operational overhead (install
> gcc or manage linux wheels). So we have avoided adding that to kafka-python
> without significant evidence showing performance benefits that can't be
> duplicated in pure python.
>
> -Dana

Re: best python library to use?

Posted by Dana Powers <da...@gmail.com>.
Agree - kafka-python was in hibernation waiting for 0.9.0.0 Kafka release,
so a few issues lingered longer than I would have liked. Most of my
comments relate to latest master, which we are hoping to release after a
bit more testing and polish.

Re librdkafka -- to be honest, I'm skeptical that C protocol bindings are
going to improve python performance much. In my experience, the devil is in
the client logic details, not the wire protocol parsing. Adding a C
compilation step also adds installation and operational overhead (install
gcc or manage linux wheels). So we have avoided adding that to kafka-python
without significant evidence showing performance benefits that can't be
duplicated in pure python.

-Dana
On Jan 11, 2016 9:02 AM, "Sam Pegler" <sa...@infectiousmedia.com>
wrote:

> kafka-python (https://github.com/dpkp/kafka-python) has also just merged
> performance improvements to the consumer in
> https://github.com/dpkp/kafka-python/issues/290 which should see a pretty
> decent boost in throughput.  We were somewhat put off by the poor
> performance in earlier versions, I imagine many people would have been in
> the same position so it's worth revisiting.
>
> Sam Pegler

Re: best python library to use?

Posted by Sam Pegler <sa...@infectiousmedia.com>.
kafka-python (https://github.com/dpkp/kafka-python) has also just merged
performance improvements to the consumer in
https://github.com/dpkp/kafka-python/issues/290 which should give a pretty
decent boost in throughput.  We were somewhat put off by the poor
performance in earlier versions; I imagine many people were in
the same position, so it's worth revisiting.

Sam Pegler

On 11 January 2016 at 16:28, Andrew Otto <ot...@wikimedia.org> wrote:

> pykafka’s balanced consumer is very useful. pykafka also has Python
> bindings to the librdkafka C library that you can optionally enable, which
> might get you some speed boosts.
>
> kafka-python (oh, I just saw this 0.9x version, hm!) was better at
> producing than pykafka for us, so we are currently using pykafka for
> consumption, and kafka-python for production.  kafka-python allows you to
> produce to multiple topics using the same client instance.  (pykafka may
> support this soon: https://github.com/Parsely/pykafka/issues/354)

Re: best python library to use?

Posted by Andrew Otto <ot...@wikimedia.org>.
pykafka’s balanced consumer is very useful. pykafka also has Python
bindings to the librdkafka C library that you can optionally enable, which
might get you some speed boosts.
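
A rough sketch of what "optionally enable" could look like in practice. It assumes that pykafka exposes the extension as the `pykafka.rdkafka` module (importing it fails with ImportError when the C extension was not built) and that `Topic.get_simple_consumer` accepts a `use_rdkafka` flag. The helper names `rdkafka_available` and `open_consumer` are made up for illustration; check the pykafka docs for your installed version before relying on this.

```python
def rdkafka_available():
    """Return True if pykafka's optional librdkafka extension can be imported."""
    try:
        # Assumption: the extension lives at pykafka.rdkafka and raises
        # ImportError when it was not compiled at install time.
        from pykafka import rdkafka  # noqa: F401
    except ImportError:
        return False
    return True


def open_consumer(topic, group, use_rdkafka=None):
    """Open a simple consumer on a pykafka Topic object.

    If use_rdkafka is None, use the C-backed consumer only when the
    extension is importable; otherwise fall back to pure Python.
    """
    if use_rdkafka is None:
        use_rdkafka = rdkafka_available()
    return topic.get_simple_consumer(consumer_group=group,
                                     use_rdkafka=use_rdkafka)
```

With a real cluster, `topic` would come from something like `KafkaClient(hosts="localhost:9092").topics[b"my-topic"]` (host and topic name are placeholders). Whether the C path is actually faster for a given workload is something to measure rather than assume.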

kafka-python (oh, I just saw this 0.9x version, hm!) was better at
producing than pykafka for us, so we are currently using pykafka for
consumption and kafka-python for production.  kafka-python allows you to
produce to multiple topics using the same client instance.  (pykafka may
support this soon: https://github.com/Parsely/pykafka/issues/354)
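
To make the "multiple topics, one client instance" point concrete, here is a minimal sketch. It assumes the `KafkaProducer` class from kafka-python 1.0 or later, whose `send(topic, value)` takes the topic per call; older releases used a different producer class. The helper names and the topic names in the usage note are hypothetical.

```python
import json


def publish_events(producer, events):
    """Send (topic, payload) pairs through one producer instance.

    `producer` is any object with a KafkaProducer-style send(topic, value)
    method; the topic is chosen per message, not per producer.
    """
    sent = 0
    for topic, payload in events:
        producer.send(topic, json.dumps(payload).encode("utf-8"))
        sent += 1
    return sent


def make_producer(bootstrap_servers="localhost:9092"):
    """Build a real producer (assumes kafka-python is installed and a broker is reachable)."""
    from kafka import KafkaProducer  # imported lazily so the sketch loads without it
    return KafkaProducer(bootstrap_servers=bootstrap_servers)
```

Usage would be along the lines of `p = make_producer()`, then `publish_events(p, [("page-views", {"id": 1}), ("errors", {"msg": "boom"})])`, followed by `p.flush()` to wait for delivery.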



On Sat, Jan 9, 2016 at 10:04 AM, Dana Powers <da...@gmail.com> wrote:

> pykafka uses a custom zookeeper implementation for consumer groups.
> kafka-python uses the 0.9.0.0 server apis to accomplish the same.
>
> -Dana

Re: best python library to use?

Posted by Dana Powers <da...@gmail.com>.
pykafka uses a custom zookeeper implementation for consumer groups.
kafka-python uses the 0.9.0.0 server apis to accomplish the same.
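
Either way, the goal of a consumer group is the same: split a topic's partitions across the group's members so each partition has exactly one reader. The sketch below shows a range-style split in plain Python. It is an illustration of the idea only, not the code either library or the broker runs, and `assign_partitions` is a made-up name.

```python
def assign_partitions(partitions, members):
    """Split partitions across group members as evenly as possible.

    Members are sorted so every participant computes the same answer;
    the first (len(partitions) % len(members)) members get one extra.
    """
    members = sorted(members)
    if not members:
        raise ValueError("a consumer group needs at least one member")
    partitions = sorted(partitions)
    base, extra = divmod(len(partitions), len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        count = base + (1 if i < extra else 0)
        assignment[member] = partitions[start:start + count]
        start += count
    return assignment
```

In kafka-python itself, joining a group should only require passing `group_id` when constructing `KafkaConsumer`; the coordination then happens through the 0.9 broker APIs rather than through ZooKeeper.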

-Dana
On Jan 8, 2016 18:32, "chengxin Cai" <ia...@outlook.com> wrote:

> Hi
>
> I heard that pykafka can create a balanced consumer.
>
> And there should be no other big difference.
>
>
> Best Regards

Re: best python library to use?

Posted by chengxin Cai <ia...@outlook.com>.
Hi 

I heard that pykafka can create a balanced consumer.

And there should be no other big difference.


Best Regards

> On Jan 9, 2016, at 08:58, Dana Powers <da...@rd.io> wrote:
> 
> Hi Doug,
> 
> The differences are fairly subtle. kafka-python is a community-backed
> project that aims to be consistent w/ the official java client; pykafka is
> sponsored by parse.ly and aims to provide a pythonic interface. whichever
> you go with, I would love to hear your specific feedback on kafka-python.
> 
> -Dana (kafka-python maintainer)

Re: best python library to use?

Posted by Dana Powers <da...@rd.io>.
Hi Doug,

The differences are fairly subtle. kafka-python is a community-backed
project that aims to be consistent with the official Java client; pykafka is
sponsored by parse.ly and aims to provide a Pythonic interface. Whichever
you go with, I would love to hear your specific feedback on kafka-python.

-Dana (kafka-python maintainer)

On Fri, Jan 8, 2016 at 4:32 PM, Doug Tomm <dc...@gmail.com> wrote:

> we're using kafka-python, weighing pykafka, and wondering if there's
> another that is better to use.  does confluent endorse or recommend a
> particular python package (psorry for the alliteration)?
>
> doug
>
>