Posted to commits@flume.apache.org by hs...@apache.org on 2014/09/24 07:36:57 UTC
git commit: FLUME-2455. Kafka Sink Documentation.
Repository: flume
Updated Branches:
refs/heads/trunk 0bbd0ad7a -> 186a3b808
FLUME-2455. Kafka Sink Documentation.
(Thilina Buddika, Gwen Shapira via Hari)
Project: http://git-wip-us.apache.org/repos/asf/flume/repo
Commit: http://git-wip-us.apache.org/repos/asf/flume/commit/186a3b80
Tree: http://git-wip-us.apache.org/repos/asf/flume/tree/186a3b80
Diff: http://git-wip-us.apache.org/repos/asf/flume/diff/186a3b80
Branch: refs/heads/trunk
Commit: 186a3b808b371cbf2c4b5a47325caa324c7c7a6e
Parents: 0bbd0ad
Author: Hari Shreedharan <hs...@apache.org>
Authored: Tue Sep 23 22:33:55 2014 -0700
Committer: Hari Shreedharan <hs...@apache.org>
Committed: Tue Sep 23 22:33:55 2014 -0700
----------------------------------------------------------------------
flume-ng-doc/sphinx/FlumeUserGuide.rst | 54 +++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/flume/blob/186a3b80/flume-ng-doc/sphinx/FlumeUserGuide.rst
----------------------------------------------------------------------
diff --git a/flume-ng-doc/sphinx/FlumeUserGuide.rst b/flume-ng-doc/sphinx/FlumeUserGuide.rst
index 3a47560..11c1ad7 100644
--- a/flume-ng-doc/sphinx/FlumeUserGuide.rst
+++ b/flume-ng-doc/sphinx/FlumeUserGuide.rst
@@ -2137,6 +2137,60 @@ auth.proxyUser -- The effective user for HDFS actions, if differ
the kerberos principal
======================= ======= ===========================================================
+
+Kafka Sink
+~~~~~~~~~~
+This is a Flume Sink implementation that can publish data to a
+`Kafka <http://kafka.apache.org/>`_ topic. One objective is to integrate
+Flume with Kafka so that pull-based processing systems can process the data
+coming through various Flume sources. This currently supports the Kafka
+0.8.x series of releases.
+
+Required properties are marked in bold font.
+
+
+=============================== =================== =============================================================================================
+Property Name                   Default             Description
+=============================== =================== =============================================================================================
+**type**                        --                  Must be set to ``org.apache.flume.sink.kafka.KafkaSink``
+**kafka.metadata.broker.list**  --                  List of brokers the Kafka Sink will connect to in order to get the list of topic partitions.
+                                                    This can be a partial list of brokers, but we recommend at least two for HA.
+                                                    The format is a comma-separated list of hostname:port entries.
+topic                           default-flume-topic The Kafka topic to which the messages will be published. If this parameter is configured,
+                                                    messages will be published to this topic.
+                                                    If the event header contains a "topic" field, the event will be published to that topic,
+                                                    overriding the topic configured here.
+batchSize                       100                 How many messages to process in one batch. Larger batches improve throughput at the cost of
+                                                    added latency.
+kafka.request.required.acks     0                   How many replicas must acknowledge a message before it is considered successfully written.
+                                                    Accepted values are 0 (never wait for acknowledgement), 1 (wait for the leader only) and
+                                                    -1 (wait for all replicas).
+                                                    The default is the fastest option, but we *highly recommend* setting this to -1 to avoid
+                                                    data loss.
+kafka.producer.type             sync                Whether messages should be sent to the broker synchronously or using an asynchronous
+                                                    background thread. Accepted values are sync (safest) and async (faster but potentially
+                                                    unsafe).
+Other Kafka Producer Properties --                  These properties are used to configure the Kafka producer. Any producer property supported
+                                                    by Kafka can be used. The only requirement is to prepend the property name with the
+                                                    prefix ``kafka.``.
+=============================== =================== =============================================================================================
+
+.. note:: Kafka Sink uses the ``topic`` and ``key`` properties from the FlumeEvent headers to send events to Kafka.
+          If ``topic`` exists in the headers, the event will be sent to that specific topic, overriding the topic configured for the Sink.
+          If ``key`` exists in the headers, the key will be used by Kafka to partition the data between the topic partitions. Events with the
+          same key will be sent to the same partition. If the key is null, events will be sent to random partitions.
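+
+The topic override described in the note can be exercised from the source
+side. A minimal sketch, assuming a hypothetical source named ``r1`` on the
+same agent, using Flume's static interceptor to stamp every event with a
+``topic`` header:
+
+.. code-block:: properties
+
+    a1.sources.r1.interceptors = i1
+    a1.sources.r1.interceptors.i1.type = static
+    a1.sources.r1.interceptors.i1.key = topic
+    # Events carrying this header are published to "priority-events",
+    # overriding the topic configured on the Kafka sink
+    a1.sources.r1.interceptors.i1.value = priority-events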
+
+An example configuration of a Kafka sink is given below. Properties starting
+with the prefix ``kafka`` are passed through when instantiating the Kafka
+producer, and are not limited to the properties given in this example.
+It is also possible to include custom properties here and access them inside
+the preprocessor through the Flume Context object passed in as a method
+argument.
+
+.. code-block:: properties
+
+ a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
+ a1.sinks.k1.topic = mytopic
+ a1.sinks.k1.kafka.metadata.broker.list = localhost:9092
+ a1.sinks.k1.kafka.request.required.acks = 1
+ a1.sinks.k1.batchSize = 20
+ a1.sinks.k1.channel = c1
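+
+For durability-sensitive flows, the same sink can trade throughput for
+safety, as recommended in the property table above. A possible variant
+(the broker hostnames below are placeholders):
+
+.. code-block:: properties
+
+    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
+    a1.sinks.k1.topic = mytopic
+    # Two brokers so metadata lookups survive a single broker failure
+    a1.sinks.k1.kafka.metadata.broker.list = broker1:9092,broker2:9092
+    # -1 waits for all replicas, to avoid data loss on leader failure
+    a1.sinks.k1.kafka.request.required.acks = -1
+    # Synchronous sends (the default) surface errors to the sink so the
+    # channel transaction can be rolled back and retried
+    a1.sinks.k1.kafka.producer.type = sync
+    a1.sinks.k1.channel = c1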
+
Custom Sink
~~~~~~~~~~~