Posted to dev@kafka.apache.org by SenthilKumar K <se...@gmail.com> on 2017/06/21 12:58:27 UTC

Handling 2 to 3 Million Events before Kafka

Hi Team, sorry if this question is off-topic for the Kafka group.

I have been trying to solve the problem of handling 5 GB/sec of ingestion.
Kafka is a really good candidate for us to handle this ingestion rate.


100K machines ----> { HTTP server (Jetty/Netty) } --> Kafka cluster

The problem I see is in the HTTP server, which can't handle more than 50K
events per second per instance. I'm thinking some other solution in front of
Kafka would be the right choice.
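
For a rough sense of scale, the figures in this post can be put into a back-of-envelope calculation. This is only a sketch: it takes the 2 to 3 million events/sec from the subject line, the 5 GB/sec target, and the 50K events/sec per HTTP instance at face value, and it assumes decimal units (1 GB = 10^9 bytes) and perfectly even load across instances.

```python
import math

EVENTS_PER_SEC_LOW = 2_000_000      # from the subject line
EVENTS_PER_SEC_HIGH = 3_000_000     # from the subject line
INGEST_BYTES_PER_SEC = 5 * 10**9    # 5 GB/sec, assuming decimal gigabytes
PER_INSTANCE_EVENTS = 50_000        # observed ceiling per HTTP server instance


def instances_needed(events_per_sec: int, per_instance: int) -> int:
    """Minimum number of HTTP instances, assuming perfectly even load."""
    return math.ceil(events_per_sec / per_instance)


def avg_event_bytes(bytes_per_sec: int, events_per_sec: int) -> float:
    """Average event size implied by the byte rate and the event rate."""
    return bytes_per_sec / events_per_sec


if __name__ == "__main__":
    for rate in (EVENTS_PER_SEC_LOW, EVENTS_PER_SEC_HIGH):
        print(rate, "events/sec ->",
              instances_needed(rate, PER_INSTANCE_EVENTS), "instances,",
              round(avg_event_bytes(INGEST_BYTES_PER_SEC, rate)), "bytes/event")
```

Under these assumptions the front tier needs 40 to 60 instances at 50K events/sec each, and the implied average event size is roughly 1.7 to 2.5 KB.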

Has anyone worked on a similar use case with a similar load? Suggestions/thoughts?

--Senthil

Re: Handling 2 to 3 Million Events before Kafka

Posted by Jeyhun Karimov <je...@gmail.com>.
Hi,

With Kafka you can increase overall throughput by increasing the number of
nodes in the cluster.
I had a similar issue, where we needed to ingest vast amounts of data into a
streaming system.
In our case, Kafka was the bottleneck because of disk I/O. To solve it, we
implemented a (simple) distributed pub-sub system in C that keeps data in
memory. You should also take into account your network bandwidth and the
(upper-bound) capacity of your processing engine or HTTP server.
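
The network-bandwidth point can be checked with simple arithmetic. A minimal sketch, assuming the 5 GB/sec target from the original post, decimal units, and a hypothetical 10 Gbit/s NIC per server (the link speed is an assumption, not something stated in this thread). Protocol overhead is ignored and only inbound traffic is counted.

```python
import math


def required_gbit_per_sec(bytes_per_sec: float) -> float:
    """Convert a byte rate into Gbit/s (decimal units)."""
    return bytes_per_sec * 8 / 1e9


def min_servers_for_bandwidth(bytes_per_sec: float, nic_gbit_per_sec: float) -> int:
    """Fewest servers whose combined NIC capacity covers the inbound rate."""
    return math.ceil(required_gbit_per_sec(bytes_per_sec) / nic_gbit_per_sec)


if __name__ == "__main__":
    print(required_gbit_per_sec(5e9), "Gbit/s inbound")
    print(min_servers_for_bandwidth(5e9, 10.0), "servers at 10 Gbit/s each")
```

So even at full line rate, 5 GB/sec (40 Gbit/s) cannot enter through fewer than four such NICs, whatever the HTTP server software is.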


Cheers,
Jeyhun


On Wed, Jun 21, 2017 at 2:58 PM SenthilKumar K <se...@gmail.com>
wrote:

> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>
> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
> is really good candidate for us to handle this ingestion rate ..
>
>
> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>
> I see the problem in Http Server where it can't handle beyond 50K events
> per instance ..  I'm thinking some other solution would be right choice
> before Kafka ..
>
> Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?
>
> --Senthil
>
-- 
-Cheers

Jeyhun

Re: Handling 2 to 3 Million Events before Kafka

Posted by SenthilKumar K <se...@gmail.com>.
Hi Barton - I think we can use the async producer with the callback API(s) to
keep track of which events failed.
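
The idea of tracking failed events through send callbacks can be sketched without a broker. In the sketch below, `FakeAsyncProducer` is a hypothetical stand-in and not the Kafka client; it only mimics the shape of an asynchronous send that later invokes a callback with an error or with none. With the real Java client, the equivalent hook is the `Callback` passed to `KafkaProducer.send(record, callback)`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeAsyncProducer:
    """Hypothetical stand-in for an async producer; NOT the Kafka client API."""

    def __init__(self, fail_when):
        self._fail_when = fail_when                  # predicate: event -> bool
        self._pool = ThreadPoolExecutor(max_workers=4)

    def send(self, event, callback):
        """Return immediately; run callback(event, error) on a worker thread."""
        def deliver():
            failed = self._fail_when(event)
            error = RuntimeError("simulated send failure") if failed else None
            callback(event, error)
        self._pool.submit(deliver)

    def flush(self):
        """Block until every submitted send has run its callback."""
        self._pool.shutdown(wait=True)


class FailureTracker:
    """Collects the events whose send callback reported an error."""

    def __init__(self):
        self._lock = threading.Lock()
        self.failed = []
        self.succeeded = 0

    def on_completion(self, event, error):
        with self._lock:                             # callbacks run on worker threads
            if error is None:
                self.succeeded += 1
            else:
                self.failed.append(event)


def send_all(events, fail_when):
    """Send every event without waiting, then report what failed."""
    producer = FakeAsyncProducer(fail_when)
    tracker = FailureTracker()
    for event in events:
        producer.send(event, tracker.on_completion)  # does not wait for the result
    producer.flush()
    return tracker
```

The caller never blocks on an individual send; failed events accumulate in `tracker.failed`, where they could be retried or logged.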

--Senthil

On Thu, Jun 22, 2017 at 4:58 PM, SenthilKumar K <se...@gmail.com>
wrote:

> Thanks Barton.. I'll look into these ..
>
> On Thu, Jun 22, 2017 at 7:12 AM, Garrett Barton <ga...@gmail.com>
> wrote:
>
>> Getting good concurrency in a webapp is more than doable.  Check out
>> these benchmarks:
>> https://www.techempower.com/benchmarks/#section=data-r14&hw=ph&test=db
>> I linked to the single query one because thats closest to a single
>> operation like you will be doing.
>>
>> I'd also note if the data delivery does not need to be guaranteed you
>> could go faster switching the web servers over to UDP and using async mode
>> on the kafka producers.
>>
>> On Wed, Jun 21, 2017 at 2:23 PM, Tauzell, Dave <
>> Dave.Tauzell@surescripts.com> wrote:
>>
>>> I’m not really familiar with Netty so I won’t be of much help.   Maybe
>>> try posting on a Netty forum to see what they think?
>>> -Dave
>>>
>>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>>> Sent: Wednesday, June 21, 2017 10:28 AM
>>> To: Tauzell, Dave
>>> Cc: users@kafka.apache.org; senthilec566@apache.org;
>>> dev@kafka.apache.org
>>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>>
>>> So netty would work for this case ?  I do have netty server and seems to
>>> be i'm not getting the expected results .. here is the git
>>> https://github.com/senthilec566/netty4-server , is this right
>>> implementation ?
>>>
>>> Cheers,
>>> Senthil
>>>
>>> On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <
>>> Dave.Tauzell@surescripts.com>
>>> wrote:
>>> I see.
>>>
>>> 1.       You don’t want the 100k machines sending directly to kafka.
>>>
>>> 2.       You can only have a small number of web servers
>>>
>>> People certainly have web-servers handling over 100k concurrent
>>> connections.  See this for some examples:
>>> https://github.com/smallnest/C1000K-Servers .
>>>
>>> It seems possible with the right sort of kafka producer tuning.
>>>
>>> -Dave
>>>
>>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>>> Sent: Wednesday, June 21, 2017 8:55 AM
>>> To: Tauzell, Dave
>>> Cc: users@kafka.apache.org; senthilec566@apache.org;
>>> dev@kafka.apache.org; Senthil kumar
>>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>>
>>> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
>>> memory ..
>>>
>>> Hi Dave ,  The problem is not with Kafka , it's all about how do you
>>> handle huge data before kafka.  I did a simple test with 5 node Kafka
>>> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
>>> scaling issue ...
>>>
>>> All we are trying is before kafka how do we handle messages from
>>> different servers ...  Webservers can send fast to kafka but still i can
>>> handle only 50k events per second which is less for my use case.. also i
>>> can't deploy 20 webservers to handle this load. I'm looking for an option
>>> what could be the best candidate before kafka , it should be super fast in
>>> getting all and send it to kafka producer ..
>>>
>>>
>>> --Senthil
>>>
>>> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
>>> Dave.Tauzell@surescripts.com>
>>> wrote:
>>> What are your configurations?
>>>
>>> - production
>>> - brokers
>>> - consumers
>>>
>>> Is the problem that web servers cannot send to Kafka fast enough or your
>>> consumers cannot process messages off of kafka fast enough?
>>> What is the average size of these messages?
>>>
>>> -Dave
>>>
>>> -----Original Message-----
>>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>>> Sent: Wednesday, June 21, 2017 7:58 AM
>>> To: users@kafka.apache.org
>>> Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
>>> Subject: Handling 2 to 3 Million Events before Kafka
>>>
>>> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>>>
>>> I have been trying to solve problem of handling 5 GB/sec ingestion.
>>> Kafka is really good candidate for us to handle this ingestion rate ..
>>>
>>>
>>> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>>>
>>> I see the problem in Http Server where it can't handle beyond 50K events
>>> per instance ..  I'm thinking some other solution would be right choice
>>> before Kafka ..
>>>
>>> Anyone worked on similar use case and similar load ?
>>> Suggestions/Thoughts ?
>>>
>>> --Senthil
>>> This e-mail and any files transmitted with it are confidential, may
>>> contain sensitive information, and are intended solely for the use of the
>>> individual or entity to whom they are addressed. If you have received this
>>> e-mail in error, please notify the sender by reply e-mail immediately and
>>> destroy all copies of the e-mail and any attachments.
>>>
>>>
>>>
>>
>

Re: Handling 2 to 3 Million Events before Kafka

Posted by SenthilKumar K <se...@gmail.com>.
Thanks Barton, I'll look into these.

On Thu, Jun 22, 2017 at 7:12 AM, Garrett Barton <ga...@gmail.com>
wrote:

> Getting good concurrency in a webapp is more than doable.  Check out these
> benchmarks:
> https://www.techempower.com/benchmarks/#section=data-r14&hw=ph&test=db
> I linked to the single query one because thats closest to a single
> operation like you will be doing.
>
> I'd also note if the data delivery does not need to be guaranteed you
> could go faster switching the web servers over to UDP and using async mode
> on the kafka producers.
>
> On Wed, Jun 21, 2017 at 2:23 PM, Tauzell, Dave <
> Dave.Tauzell@surescripts.com> wrote:
>
>> I’m not really familiar with Netty so I won’t be of much help.   Maybe
>> try posting on a Netty forum to see what they think?
>> -Dave
>>
>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>> Sent: Wednesday, June 21, 2017 10:28 AM
>> To: Tauzell, Dave
>> Cc: users@kafka.apache.org; senthilec566@apache.org; dev@kafka.apache.org
>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>
>> So netty would work for this case ?  I do have netty server and seems to
>> be i'm not getting the expected results .. here is the git
>> https://github.com/senthilec566/netty4-server , is this right
>> implementation ?
>>
>> Cheers,
>> Senthil
>>
>> On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <
>> Dave.Tauzell@surescripts.com> wrote:
>> I see.
>>
>> 1.       You don’t want the 100k machines sending directly to kafka.
>>
>> 2.       You can only have a small number of web servers
>>
>> People certainly have web-servers handling over 100k concurrent
>> connections.  See this for some examples:
>> https://github.com/smallnest/C1000K-Servers .
>>
>> It seems possible with the right sort of kafka producer tuning.
>>
>> -Dave
>>
>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>> Sent: Wednesday, June 21, 2017 8:55 AM
>> To: Tauzell, Dave
>> Cc: users@kafka.apache.org; senthilec566@apache.org;
>> dev@kafka.apache.org; Senthil kumar
>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>
>> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
>> memory ..
>>
>> Hi Dave ,  The problem is not with Kafka , it's all about how do you
>> handle huge data before kafka.  I did a simple test with 5 node Kafka
>> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
>> scaling issue ...
>>
>> All we are trying is before kafka how do we handle messages from
>> different servers ...  Webservers can send fast to kafka but still i can
>> handle only 50k events per second which is less for my use case.. also i
>> can't deploy 20 webservers to handle this load. I'm looking for an option
>> what could be the best candidate before kafka , it should be super fast in
>> getting all and send it to kafka producer ..
>>
>>
>> --Senthil
>>
>> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
>> Dave.Tauzell@surescripts.com> wrote:
>> What are your configurations?
>>
>> - production
>> - brokers
>> - consumers
>>
>> Is the problem that web servers cannot send to Kafka fast enough or your
>> consumers cannot process messages off of kafka fast enough?
>> What is the average size of these messages?
>>
>> -Dave
>>
>> -----Original Message-----
>> From: SenthilKumar K [mailto:senthilec566@gmail.com]
>> Sent: Wednesday, June 21, 2017 7:58 AM
>> To: users@kafka.apache.org
>> Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
>> Subject: Handling 2 to 3 Million Events before Kafka
>>
>> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>>
>> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
>> is really good candidate for us to handle this ingestion rate ..
>>
>>
>> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>>
>> I see the problem in Http Server where it can't handle beyond 50K events
>> per instance ..  I'm thinking some other solution would be right choice
>> before Kafka ..
>>
>> Anyone worked on similar use case and similar load ? Suggestions/Thoughts
>> ?
>>
>> --Senthil
>>
>>
>>
>

Re: Handling 2 to 3 Million Events before Kafka

Posted by Garrett Barton <ga...@gmail.com>.
Getting good concurrency in a webapp is more than doable. Check out these
benchmarks:
https://www.techempower.com/benchmarks/#section=data-r14&hw=ph&test=db
I linked to the single-query one because that's closest to the single
operation you will be doing.

I'd also note that if data delivery does not need to be guaranteed, you could
go faster by switching the web servers over to UDP and using async mode on the
Kafka producers.
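
As a rough illustration of the UDP suggestion, here is a minimal sketch of a datagram receiver that hands events to an in-memory queue, from which a separate thread could feed an async Kafka producer. The names `receive_events` and `loopback_demo` are hypothetical and nothing here comes from the thread. UDP gives no delivery guarantee, so datagrams can be dropped under load.

```python
import queue
import socket


def receive_events(sock: socket.socket, out: queue.Queue, max_events: int,
                   timeout_sec: float = 1.0) -> int:
    """Read up to max_events datagrams from sock into out; stop early on timeout."""
    sock.settimeout(timeout_sec)
    received = 0
    while received < max_events:
        try:
            payload, _addr = sock.recvfrom(65535)    # one datagram = one event
        except socket.timeout:
            break                                    # nothing more arrived
        try:
            out.put_nowait(payload)                  # never block the receive loop
        except queue.Full:
            pass                                     # shed load instead of stalling
        received += 1
    return received


def loopback_demo(n: int = 5) -> list:
    """Send n datagrams to a receiver on 127.0.0.1 and return what arrived."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))                        # port 0: let the OS pick a port
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    events = queue.Queue(maxsize=1024)
    try:
        for i in range(n):
            tx.sendto(f"event-{i}".encode(), rx.getsockname())
        receive_events(rx, events, max_events=n)
    finally:
        tx.close()
        rx.close()
    return [events.get_nowait() for _ in range(events.qsize())]
```

The receive loop drops events when the queue is full rather than blocking, which matches the "delivery not guaranteed" trade-off: throughput is bought by accepting loss.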

On Wed, Jun 21, 2017 at 2:23 PM, Tauzell, Dave <Dave.Tauzell@surescripts.com>
wrote:

> I’m not really familiar with Netty so I won’t be of much help.   Maybe try
> posting on a Netty forum to see what they think?
> -Dave
>
> From: SenthilKumar K [mailto:senthilec566@gmail.com]
> Sent: Wednesday, June 21, 2017 10:28 AM
> To: Tauzell, Dave
> Cc: users@kafka.apache.org; senthilec566@apache.org; dev@kafka.apache.org
> Subject: Re: Handling 2 to 3 Million Events before Kafka
>
> So netty would work for this case ?  I do have netty server and seems to
> be i'm not getting the expected results .. here is the git
> https://github.com/senthilec566/netty4-server , is this right
> implementation ?
>
> Cheers,
> Senthil
>
> On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <
> Dave.Tauzell@surescripts.com> wrote:
> I see.
>
> 1.       You don’t want the 100k machines sending directly to kafka.
>
> 2.       You can only have a small number of web servers
>
> People certainly have web-servers handling over 100k concurrent
> connections.  See this for some examples:
> https://github.com/smallnest/C1000K-Servers .
>
> It seems possible with the right sort of kafka producer tuning.
>
> -Dave
>
> From: SenthilKumar K [mailto:senthilec566@gmail.com]
> Sent: Wednesday, June 21, 2017 8:55 AM
> To: Tauzell, Dave
> Cc: users@kafka.apache.org; senthilec566@apache.org;
> dev@kafka.apache.org; Senthil kumar
> Subject: Re: Handling 2 to 3 Million Events before Kafka
>
> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
> memory ..
>
> Hi Dave ,  The problem is not with Kafka , it's all about how do you
> handle huge data before kafka.  I did a simple test with 5 node Kafka
> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
> scaling issue ...
>
> All we are trying is before kafka how do we handle messages from different
> servers ...  Webservers can send fast to kafka but still i can handle only
> 50k events per second which is less for my use case.. also i can't deploy
> 20 webservers to handle this load. I'm looking for an option what could be
> the best candidate before kafka , it should be super fast in getting all
> and send it to kafka producer ..
>
>
> --Senthil
>
> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
> Dave.Tauzell@surescripts.com> wrote:
> What are your configurations?
>
> - production
> - brokers
> - consumers
>
> Is the problem that web servers cannot send to Kafka fast enough or your
> consumers cannot process messages off of kafka fast enough?
> What is the average size of these messages?
>
> -Dave
>
> -----Original Message-----
> From: SenthilKumar K [mailto:senthilec566@gmail.com]
> Sent: Wednesday, June 21, 2017 7:58 AM
> To: users@kafka.apache.org
> Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
> Subject: Handling 2 to 3 Million Events before Kafka
>
> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>
> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
> is really good candidate for us to handle this ingestion rate ..
>
>
> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>
> I see the problem in Http Server where it can't handle beyond 50K events
> per instance ..  I'm thinking some other solution would be right choice
> before Kafka ..
>
> Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?
>
> --Senthil
>
>
>

RE: Handling 2 to 3 Million Events before Kafka

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
I’m not really familiar with Netty, so I won’t be of much help. Maybe try posting on a Netty forum to see what they think?
-Dave

From: SenthilKumar K [mailto:senthilec566@gmail.com]
Sent: Wednesday, June 21, 2017 10:28 AM
To: Tauzell, Dave
Cc: users@kafka.apache.org; senthilec566@apache.org; dev@kafka.apache.org
Subject: Re: Handling 2 to 3 Million Events before Kafka

So netty would work for this case ?  I do have netty server and seems to be i'm not getting the expected results .. here is the git https://github.com/senthilec566/netty4-server , is this right implementation ?

Cheers,
Senthil

On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <Da...@surescripts.com>> wrote:
I see.

1.       You don’t want the 100k machines sending directly to kafka.

2.       You can only have a small number of web servers

People certainly have web-servers handling over 100k concurrent connections.  See this for some examples:  https://github.com/smallnest/C1000K-Servers .

It seems possible with the right sort of kafka producer tuning.

-Dave

From: SenthilKumar K [mailto:senthilec566@gmail.com<ma...@gmail.com>]
Sent: Wednesday, June 21, 2017 8:55 AM
To: Tauzell, Dave
Cc: users@kafka.apache.org<ma...@kafka.apache.org>; senthilec566@apache.org<ma...@apache.org>; dev@kafka.apache.org<ma...@kafka.apache.org>; Senthil kumar
Subject: Re: Handling 2 to 3 Million Events before Kafka

Thanks Jeyhun. Yes http server would be problematic here w.r.t network , memory ..

Hi Dave ,  The problem is not with Kafka , it's all about how do you handle huge data before kafka.  I did a simple test with 5 node Kafka Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a scaling issue ...

All we are trying is before kafka how do we handle messages from different servers ...  Webservers can send fast to kafka but still i can handle only 50k events per second which is less for my use case.. also i can't deploy 20 webservers to handle this load. I'm looking for an option what could be the best candidate before kafka , it should be super fast in getting all and send it to kafka producer ..


--Senthil

On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <Da...@surescripts.com>> wrote:
What are your configurations?

- production
- brokers
- consumers

Is the problem that web servers cannot send to Kafka fast enough or your consumers cannot process messages off of kafka fast enough?
What is the average size of these messages?

-Dave

-----Original Message-----
From: SenthilKumar K [mailto:senthilec566@gmail.com<ma...@gmail.com>]
Sent: Wednesday, June 21, 2017 7:58 AM
To: users@kafka.apache.org<ma...@kafka.apache.org>
Cc: senthilec566@apache.org<ma...@apache.org>; Senthil kumar; dev@kafka.apache.org<ma...@kafka.apache.org>
Subject: Handling 2 to 3 Million Events before Kafka

Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...

I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka is really good candidate for us to handle this ingestion rate ..


100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..

I see the problem in Http Server where it can't handle beyond 50K events per instance ..  I'm thinking some other solution would be right choice before Kafka ..

Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?

--Senthil
This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.



Re: Handling 2 to 3 Million Events before Kafka

Posted by SenthilKumar K <se...@gmail.com>.
So Netty would work for this case? I do have a Netty server, but I'm not
getting the expected results. Here is the repository:
https://github.com/senthilec566/netty4-server . Is this the right
implementation?

Cheers,
Senthil

On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <Dave.Tauzell@surescripts.com
> wrote:

> I see.
>
> 1.       You don’t want the 100k machines sending directly to kafka.
>
> 2.       You can only have a small number of web servers
>
>
>
> People certainly have web-servers handling over 100k concurrent
> connections.  See this for some examples:  https://github.com/smallnest/
> C1000K-Servers .
>
>
>
> It seems possible with the right sort of kafka producer tuning.
>
>
>
> -Dave
>
>
>
> *From:* SenthilKumar K [mailto:senthilec566@gmail.com]
> *Sent:* Wednesday, June 21, 2017 8:55 AM
> *To:* Tauzell, Dave
> *Cc:* users@kafka.apache.org; senthilec566@apache.org;
> dev@kafka.apache.org; Senthil kumar
> *Subject:* Re: Handling 2 to 3 Million Events before Kafka
>
>
>
> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
> memory ..
>
>
>
> Hi Dave ,  The problem is not with Kafka , it's all about how do you
> handle huge data before kafka.  I did a simple test with 5 node Kafka
> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
> scaling issue ...
>
>
>
> All we are trying is before kafka how do we handle messages from different
> servers ...  Webservers can send fast to kafka but still i can handle only
> 50k events per second which is less for my use case.. also i can't deploy
> 20 webservers to handle this load. I'm looking for an option what could be
> the best candidate before kafka , it should be super fast in getting all
> and send it to kafka producer ..
>
>
>
>
>
> --Senthil
>
>
>
> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
> Dave.Tauzell@surescripts.com> wrote:
>
> What are your configurations?
>
> - production
> - brokers
> - consumers
>
> Is the problem that web servers cannot send to Kafka fast enough or your
> consumers cannot process messages off of kafka fast enough?
> What is the average size of these messages?
>
> -Dave
>
>
> -----Original Message-----
> From: SenthilKumar K [mailto:senthilec566@gmail.com]
> Sent: Wednesday, June 21, 2017 7:58 AM
> To: users@kafka.apache.org
> Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
> Subject: Handling 2 to 3 Million Events before Kafka
>
> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>
> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
> is really good candidate for us to handle this ingestion rate ..
>
>
> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>
> I see the problem in Http Server where it can't handle beyond 50K events
> per instance ..  I'm thinking some other solution would be right choice
> before Kafka ..
>
> Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?
>
> --Senthil
>
> This e-mail and any files transmitted with it are confidential, may
> contain sensitive information, and are intended solely for the use of the
> individual or entity to whom they are addressed. If you have received this
> e-mail in error, please notify the sender by reply e-mail immediately and
> destroy all copies of the e-mail and any attachments.
>
>
>

RE: Handling 2 to 3 Million Events before Kafka

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
I see.

1. You don’t want the 100k machines sending directly to Kafka.

2. You can only have a small number of web servers.

People certainly have web-servers handling over 100k concurrent connections.  See this for some examples:  https://github.com/smallnest/C1000K-Servers .

It seems possible with the right sort of kafka producer tuning.

-Dave

From: SenthilKumar K [mailto:senthilec566@gmail.com]
Sent: Wednesday, June 21, 2017 8:55 AM
To: Tauzell, Dave
Cc: users@kafka.apache.org; senthilec566@apache.org; dev@kafka.apache.org; Senthil kumar
Subject: Re: Handling 2 to 3 Million Events before Kafka

Thanks Jeyhun. Yes http server would be problematic here w.r.t network , memory ..

Hi Dave ,  The problem is not with Kafka , it's all about how do you handle huge data before kafka.  I did a simple test with 5 node Kafka Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a scaling issue ...

All we are trying is before kafka how do we handle messages from different servers ...  Webservers can send fast to kafka but still i can handle only 50k events per second which is less for my use case.. also i can't deploy 20 webservers to handle this load. I'm looking for an option what could be the best candidate before kafka , it should be super fast in getting all and send it to kafka producer ..


--Senthil

On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <Da...@surescripts.com>> wrote:
What are your configurations?

- production
- brokers
- consumers

Is the problem that web servers cannot send to Kafka fast enough or your consumers cannot process messages off of kafka fast enough?
What is the average size of these messages?

-Dave

-----Original Message-----
From: SenthilKumar K [mailto:senthilec566@gmail.com<ma...@gmail.com>]
Sent: Wednesday, June 21, 2017 7:58 AM
To: users@kafka.apache.org<ma...@kafka.apache.org>
Cc: senthilec566@apache.org<ma...@apache.org>; Senthil kumar; dev@kafka.apache.org<ma...@kafka.apache.org>
Subject: Handling 2 to 3 Million Events before Kafka

Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...

I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka is really good candidate for us to handle this ingestion rate ..


100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..

I see the problem in Http Server where it can't handle beyond 50K events per instance ..  I'm thinking some other solution would be right choice before Kafka ..

Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?

--Senthil
This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.
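
A rough illustration of the "kafka producer tuning" Dave mentions above. This is only a sketch: the thread never states which settings were used, so every value below is an assumption chosen to favour batching and throughput, and the class name ProducerTuningSketch is made up for the example. The keys themselves are standard Kafka producer configuration names; the resulting Properties object would be passed to a KafkaProducer from the kafka-clients library, which is not needed to compile this snippet.

```java
import java.util.Properties;

public class ProducerTuningSketch {

    // Builds producer settings that trade a little latency for larger batches.
    // All values are illustrative assumptions, not numbers from the thread.
    static Properties throughputProps(String bootstrapServers) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", bootstrapServers);
        p.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        p.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        // Wait for the partition leader only; weaker durability than "all".
        p.setProperty("acks", "1");
        // Let records wait up to 20 ms so that batches can fill up.
        p.setProperty("linger.ms", "20");
        // 256 KiB per-partition batch buffer.
        p.setProperty("batch.size", "262144");
        // Compress whole batches to reduce network and broker disk load.
        p.setProperty("compression.type", "lz4");
        // 256 MiB of client-side buffering before send() starts to block.
        p.setProperty("buffer.memory", "268435456");
        return p;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        throughputProps("localhost:9092")
                .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Whether settings like these lift a single web server past 50k events per second would have to be measured; the HTTP layer in front of the producer may still be the limit.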


Re: Handling 2 to 3 Million Events before Kafka

Posted by SenthilKumar K <se...@gmail.com>.
Thanks Jeyhun. Yes, the HTTP server would be problematic here w.r.t.
network and memory.

Hi Dave, the problem is not with Kafka; it's all about how to handle huge
data before Kafka. I did a simple test with a 5-node Kafka cluster, which
gave a good result (~950 MB/s), so I don't see a scaling issue on the Kafka
side.

What we are trying to work out is how to handle messages from different
servers before Kafka. The web servers can send to Kafka fast, but I can
still handle only 50k events per second, which is too low for my use case,
and I can't deploy 20 web servers to handle this load. I'm looking for the
best candidate to put in front of Kafka; it should be super fast at
receiving everything and handing it to the Kafka producer.


--Senthil

On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <Dave.Tauzell@surescripts.com
> wrote:

> What are your configurations?
>
> - production
> - brokers
> - consumers
>
> Is the problem that web servers cannot send to Kafka fast enough or your
> consumers cannot process messages off of kafka fast enough?
> What is the average size of these messages?
>
> -Dave
>
> -----Original Message-----
> From: SenthilKumar K [mailto:senthilec566@gmail.com]
> Sent: Wednesday, June 21, 2017 7:58 AM
> To: users@kafka.apache.org
> Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
> Subject: Handling 2 to 3 Million Events before Kafka
>
> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>
> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
> is really good candidate for us to handle this ingestion rate ..
>
>
> 100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..
>
> I see the problem in Http Server where it can't handle beyond 50K events
> per instance ..  I'm thinking some other solution would be right choice
> before Kafka ..
>
> Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?
>
> --Senthil
> This e-mail and any files transmitted with it are confidential, may
> contain sensitive information, and are intended solely for the use of the
> individual or entity to whom they are addressed. If you have received this
> e-mail in error, please notify the sender by reply e-mail immediately and
> destroy all copies of the e-mail and any attachments.
>

RE: Handling 2 to 3 Million Events before Kafka

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
What are your configurations?

- production
- brokers
- consumers

Is the problem that web servers cannot send to Kafka fast enough or your consumers cannot process messages off of kafka fast enough?
What is the average size of these messages?

-Dave

-----Original Message-----
From: SenthilKumar K [mailto:senthilec566@gmail.com]
Sent: Wednesday, June 21, 2017 7:58 AM
To: users@kafka.apache.org
Cc: senthilec566@apache.org; Senthil kumar; dev@kafka.apache.org
Subject: Handling 2 to 3 Million Events before Kafka

Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...

I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka is really good candidate for us to handle this ingestion rate ..


100K machines ----> { Http Server (Jetty/Netty) } --> Kafka Cluster..

I see the problem in Http Server where it can't handle beyond 50K events per instance ..  I'm thinking some other solution would be right choice before Kafka ..

Anyone worked on similar use case and similar load ? Suggestions/Thoughts ?

--Senthil
This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.
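
Dave's question about average message size can be estimated from the figures already in the thread. The sketch below assumes that the "2 to 3 Million Events" in the subject line means events per second and that "5 GB/sec" is decimal gigabytes; neither is stated explicitly. The class and method names are made up for the example.

```java
public class IngestEstimate {

    // Ceiling division: instances needed to absorb the total event rate.
    static long instancesNeeded(long totalEventsPerSec, long perInstanceEventsPerSec) {
        return (totalEventsPerSec + perInstanceEventsPerSec - 1) / perInstanceEventsPerSec;
    }

    // Average event size in bytes (integer division, rounds down).
    static long avgEventBytes(long totalBytesPerSec, long totalEventsPerSec) {
        return totalBytesPerSec / totalEventsPerSec;
    }

    public static void main(String[] args) {
        long bytesPerSec = 5_000_000_000L; // 5 GB/sec, assumed decimal
        long perInstance = 50_000L;        // observed per-instance ceiling

        for (long events : new long[] {2_000_000L, 3_000_000L}) {
            long n = instancesNeeded(events, perInstance);
            System.out.printf(
                    "%d events/sec: %d instances, ~%d bytes/event, ~%d MB/sec per instance%n",
                    events, n, avgEventBytes(bytesPerSec, events),
                    bytesPerSec / n / 1_000_000L);
        }
    }
}
```

Under those assumptions, 50k events per second per instance works out to 40 to 60 HTTP server instances. Spreading 5 GB/sec over only 20 servers would mean about 250 MB/sec each, so network bandwidth per machine matters as much as the event rate.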
