Posted to commits@kafka.apache.org by ij...@apache.org on 2016/05/10 07:01:06 UTC

kafka git commit: MINOR: Add Kafka Streams API / upgrade notes

Repository: kafka
Updated Branches:
  refs/heads/trunk 9575e9307 -> 6f1873242


MINOR: Add Kafka Streams API / upgrade notes

Author: Guozhang Wang <wa...@gmail.com>

Reviewers: Michael G. Noll <mi...@confluent.io>, Ismael Juma <is...@juma.me.uk>

Closes #1321 from guozhangwang/KStreamsJavaDoc


Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/6f187324
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/6f187324
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/6f187324

Branch: refs/heads/trunk
Commit: 6f1873242c1a189770319e09f53467d26584112f
Parents: 9575e93
Author: Guozhang Wang <wa...@gmail.com>
Authored: Tue May 10 08:00:51 2016 +0100
Committer: Ismael Juma <is...@juma.me.uk>
Committed: Tue May 10 08:00:51 2016 +0100

----------------------------------------------------------------------
 docs/api.html           | 19 +++++++++++++++++++
 docs/documentation.html |  1 +
 docs/quickstart.html    | 18 +++++++++---------
 docs/upgrade.html       |  1 +
 docs/uses.html          |  2 +-
 5 files changed, 31 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka/blob/6f187324/docs/api.html
----------------------------------------------------------------------
diff --git a/docs/api.html b/docs/api.html
index 8d5be9b..c457241 100644
--- a/docs/api.html
+++ b/docs/api.html
@@ -165,3 +165,22 @@ This new unified consumer API removes the distinction between the 0.8 high-level
 
 Examples showing how to use the consumer are given in the
 <a href="http://kafka.apache.org/0100/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html" title="Kafka 0.10.0 Javadoc">javadocs</a>.
+
+<h3><a id="streamsapi" href="#streamsapi">2.3 Streams API</a></h3>
+
+As of the 0.10.0 release we have added a new client library named <b>Kafka Streams</b> that lets users implement stream processing
+applications over data stored in Kafka topics. Kafka Streams is considered alpha quality and its public APIs are likely to change in
+future releases.
+You can use Kafka Streams by adding a dependency on the streams jar using
+the following example Maven coordinates (update the version number for newer releases):
+
+<pre>
+	&lt;dependency&gt;
+	    &lt;groupId&gt;org.apache.kafka&lt;/groupId&gt;
+	    &lt;artifactId&gt;kafka-streams&lt;/artifactId&gt;
+	    &lt;version&gt;0.10.0.0&lt;/version&gt;
+	&lt;/dependency&gt;
+</pre>
+
+Examples showing how to use this library are given in the
+<a href="http://kafka.apache.org/0100/javadoc/index.html?org/apache/kafka/streams/KafkaStreams.html" title="Kafka 0.10.0 Javadoc">javadocs</a> (note the classes annotated with <b>@InterfaceStability.Unstable</b>, indicating that their public APIs may change without backward compatibility in future releases).
\ No newline at end of file
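To show how the dependency above is put to use, here is a minimal application sketch. It is not taken from this commit: the application id, topic names, and broker/zookeeper addresses are hypothetical, and running it requires the kafka-streams jar declared above plus a 0.10.x broker.

```java
import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class MinimalStreamsApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical application id and connection addresses.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "minimal-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "localhost:2181");

        // Pipe records unchanged from one (hypothetical) topic to another.
        KStreamBuilder builder = new KStreamBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder, props);
        streams.start();
    }
}
```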

http://git-wip-us.apache.org/repos/asf/kafka/blob/6f187324/docs/documentation.html
----------------------------------------------------------------------
diff --git a/docs/documentation.html b/docs/documentation.html
index 70002ab..ddc3102 100644
--- a/docs/documentation.html
+++ b/docs/documentation.html
@@ -40,6 +40,7 @@ Prior releases: <a href="/07/documentation.html">0.7.x</a>, <a href="/08/documen
                       <li><a href="#simpleconsumerapi">2.2.2 Old Simple Consumer API</a>
                       <li><a href="#newconsumerapi">2.2.3 New Consumer API</a>
                   </ul>
+              <li><a href="#streamsapi">2.3 Streams API</a>
           </ul>
     </li>
     <li><a href="#configuration">3. Configuration</a>

http://git-wip-us.apache.org/repos/asf/kafka/blob/6f187324/docs/quickstart.html
----------------------------------------------------------------------
diff --git a/docs/quickstart.html b/docs/quickstart.html
index 7a923c6..4d4f7ea 100644
--- a/docs/quickstart.html
+++ b/docs/quickstart.html
@@ -258,15 +258,15 @@ This quickstart example will demonstrate how to run a streaming application code
 of the <code>WordCountDemo</code> example code (converted to use Java 8 lambda expressions for easy reading).
 </p>
 <pre>
-KStream<String, Long> wordCounts = textLines
-// Split each text line, by whitespace, into words.
-.flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
-// Ensure the words are available as message keys for the next aggregate operation.
-.map((key, value) -> new KeyValue<>(value, value))
-// Count the occurrences of each word (message key).
-.countByKey(stringSerializer, longSerializer, stringDeserializer, longDeserializer, "Counts")
-// Convert the resulted aggregate table into another stream.
-.toStream();
+KTable<String, Long> wordCounts = textLines
+    // Split each text line, by whitespace, into words.
+    .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
+
+    // Ensure the words are available as record keys for the next aggregate operation.
+    .map((key, value) -> new KeyValue<>(value, value))
+
+    // Count the occurrences of each word (record key) and store the results in a table named "Counts".
+    .countByKey("Counts")
 </pre>
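The Streams operators above apply ordinary Java functions to each record. As a standalone illustration (plain JDK, no Kafka required; the class name and input line are invented for this example), the lowercasing and splitting performed by the <code>flatMapValues</code> step and the key assignment performed by the <code>map</code> step amount to:

```java
import java.util.Arrays;
import java.util.List;

public class WordSplitDemo {
    // Mirrors the flatMapValues step: lowercase the line and split it on non-word characters.
    static List<String> splitWords(String line) {
        return Arrays.asList(line.toLowerCase().split("\\W+"));
    }

    public static void main(String[] args) {
        // Mirrors the map step: each word becomes both key and value of a new record.
        for (String word : splitWords("All streams lead to Kafka")) {
            System.out.println(word + " -> " + word);
        }
    }
}
```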
 
 <p>

http://git-wip-us.apache.org/repos/asf/kafka/blob/6f187324/docs/upgrade.html
----------------------------------------------------------------------
diff --git a/docs/upgrade.html b/docs/upgrade.html
index 486954c..4b8ec7e 100644
--- a/docs/upgrade.html
+++ b/docs/upgrade.html
@@ -90,6 +90,7 @@ work with 0.10.0.x brokers. Therefore, 0.9.0.0 clients should be upgraded to 0.9
 <h5><a id="upgrade_10_notable" href="#upgrade_10_notable">Notable changes in 0.10.0.0</a></h5>
 
 <ul>
+    <li> Starting from Kafka 0.10.0.0, a new client library named <b>Kafka Streams</b> is available for stream processing on data stored in Kafka topics. This new client library only works with brokers of version 0.10.x and above, due to the message format changes mentioned above. For more information please read <a href="#streams_overview">this section</a>.</li>
     <li> The default value of the configuration parameter <code>receive.buffer.bytes</code> is now 64K for the new consumer.</li>
     <li> The new consumer now exposes the configuration parameter <code>exclude.internal.topics</code> to restrict internal topics (such as the consumer offsets topic) from accidentally being included in regular expression subscriptions. By default, it is enabled.</li>
     <li> The old Scala producer has been deprecated. Users should migrate their code to the Java producer included in the kafka-clients JAR as soon as possible. </li>
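The consumer-related items above are plain configuration parameters; a minimal sketch (plain JDK <code>Properties</code>, values chosen for illustration) of setting them explicitly rather than relying on the new defaults:

```java
import java.util.Properties;

public class NewConsumerDefaults {
    static Properties consumerOverrides() {
        Properties props = new Properties();
        // receive.buffer.bytes now defaults to 64K for the new consumer; here it is set explicitly.
        props.put("receive.buffer.bytes", String.valueOf(64 * 1024));
        // exclude.internal.topics keeps internal topics (e.g. consumer offsets) out of
        // regular expression subscriptions; it is enabled by default.
        props.put("exclude.internal.topics", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerOverrides());
    }
}
```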

http://git-wip-us.apache.org/repos/asf/kafka/blob/6f187324/docs/uses.html
----------------------------------------------------------------------
diff --git a/docs/uses.html b/docs/uses.html
index f769bed..5b97272 100644
--- a/docs/uses.html
+++ b/docs/uses.html
@@ -45,7 +45,7 @@ In comparison to log-centric systems like Scribe or Flume, Kafka offers equally
 
 <h4><a id="uses_streamprocessing" href="#uses_streamprocessing">Stream Processing</a></h4>
 
-Many users end up doing stage-wise processing of data where data is consumed from topics of raw data and then aggregated, enriched, or otherwise transformed into new Kafka topics for further consumption. For example a processing flow for article recommendation might crawl article content from RSS feeds and publish it to an "articles" topic; further processing might help normalize or deduplicate this content to a topic of cleaned article content; a final stage might attempt to match this content to users. This creates a graph of real-time data flow out of the individual topics. <a href="https://storm.apache.org/">Storm</a> and <a href="http://samza.apache.org/">Samza</a> are popular frameworks for implementing these kinds of transformations.
+Many users of Kafka process data in processing pipelines consisting of multiple stages, where raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics for further consumption or follow-up processing. For example, a processing pipeline for recommending news articles might crawl article content from RSS feeds and publish it to an "articles" topic; further processing might normalize or deduplicate this content and publish the cleansed article content to a new topic; a final processing stage might attempt to recommend this content to users. Such processing pipelines create graphs of real-time data flows based on the individual topics. Starting in 0.10.0.0, a light-weight but powerful stream processing library called <a href="#streams_overview">Kafka Streams</a> is available in Apache Kafka to perform such data processing as described above. Apart from Kafka Streams, alternative open source stream processing tools include <a href="https://storm.apache.org/">Apache Storm</a> and <a href="http://samza.apache.org/">Apache Samza</a>.
 
 <h4><a id="uses_eventsourcing" href="#uses_eventsourcing">Event Sourcing</a></h4>