Posted to user@storm.apache.org by Jon Logan <jm...@buffalo.edu> on 2014/04/04 03:44:24 UTC

Replacing DRPC With Kafka

Has anyone attempted to replace Storm DRPC with Kafka?  My main concern
stems from the weight of Kafka topics...especially for the handling of
return results to clients.

Re: Replacing DRPC With Kafka

Posted by Jason Jackson <ja...@gmail.com>.
Never tried this. The DRPC server is horizontally scalable: you can launch
as many as you want and spread your clients' connections across all of the
servers.
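
A minimal sketch of spreading client connections across several DRPC servers
by simple round-robin. The host names below are hypothetical, 3772 is assumed
to be the DRPC port (Storm's default), and the picker class is illustrative,
not part of Storm's client API. A real client would open a DRPC connection to
whichever address is returned.

```python
import itertools


class RoundRobinDrpcPicker:
    """Hands out DRPC server addresses in rotation (illustrative helper only)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one DRPC server is required")
        # cycle() repeats the list forever, so each call moves to the next server.
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        """Return the (host, port) the next client connection should use."""
        return next(self._cycle)


# Hypothetical hosts; 3772 is assumed to be the configured DRPC port.
picker = RoundRobinDrpcPicker([
    ("drpc1.example.com", 3772),
    ("drpc2.example.com", 3772),
    ("drpc3.example.com", 3772),
])
```

Each new client connection would call picker.next_server() so that load is
spread evenly as more DRPC servers are added.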

I think the idea with DRPC is that it's supposed to give very low latency
between request and reply, e.g. if it's used to power a website. As far as I
know, Kafka with its default settings will add at least 1 second of latency,
as I believe you can't consume a message from Kafka until the producer has
produced it and there has been an fsync. There might be a way to disable this
in Kafka (while sacrificing durability).


On Thu, Apr 3, 2014 at 6:44 PM, Jon Logan <jm...@buffalo.edu> wrote:

> Has anyone attempted to replace Storm DRPC with Kafka?  My main concern
> stems from the weight of Kafka topics...especially for the handling of
> return results to clients.
>

RE: Replacing DRPC With Kafka

Posted by Simon Cooper <si...@featurespace.co.uk>.
We use Kafka for inter-topology communication. Going from input -> Kafka -> Storm -> Kafka -> output takes around 20-40ms. You need to use Kafka 0.8 or greater, though, as previous versions force an fsync on every message, which makes performance plummet.

The main reasons we went for Kafka rather than DRPC were that Kafka gives us a permanent log of all the messages in and out of the system, and that messages won't be lost on a server failure (whereas they might be with DRPC, as that just uses an in-memory queue).
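
To illustrate the input -> Kafka -> Storm -> Kafka -> output flow, here is a rough, in-memory sketch of request/reply over append-only topics. It does not use a real Kafka client or Storm: InMemoryLog, the topic names "requests" and "replies", and the correlation-ID scheme are all assumptions made for the example, not a description of our actual setup. It shows one possible way to return results to clients through a single shared reply topic rather than one topic per client.

```python
import uuid
from collections import defaultdict


class InMemoryLog:
    """Stand-in for a Kafka-like broker: append-only topics read by offset."""

    def __init__(self):
        self._topics = defaultdict(list)

    def append(self, topic, message):
        """Append a message and return its offset within the topic."""
        self._topics[topic].append(message)
        return len(self._topics[topic]) - 1

    def read_from(self, topic, offset):
        """Return every message at or after the given offset."""
        return self._topics[topic][offset:]


def send_request(log, payload):
    """Client side: publish a request tagged with a fresh correlation ID."""
    correlation_id = str(uuid.uuid4())
    log.append("requests", {"id": correlation_id, "payload": payload})
    return correlation_id


def process_requests(log, offset, handler):
    """Stand-in for the topology: consume requests, publish tagged replies."""
    messages = log.read_from("requests", offset)
    for message in messages:
        result = handler(message["payload"])
        log.append("replies", {"id": message["id"], "result": result})
    # The caller keeps the new offset so that requests are not reprocessed.
    return offset + len(messages)


def find_reply(log, correlation_id):
    """Client side: scan the shared reply topic for the matching reply."""
    for message in log.read_from("replies", 0):
        if message["id"] == correlation_id:
            return message["result"]
    return None
```

Because nothing is deleted from the topics, both requests and replies remain available afterwards, which is the "permanent log" property described above. In a real deployment the client would poll or subscribe to the reply topic instead of scanning it from offset 0.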

SimonC

From: Jon Logan [mailto:jmlogan@buffalo.edu]
Sent: 04 April 2014 02:44
To: user@storm.incubator.apache.org
Subject: Replacing DRPC With Kafka

Has anyone attempted to replace Storm DRPC with Kafka?  My main concern stems from the weight of Kafka topics...especially for the handling of return results to clients.