Posted to commits@kafka.apache.org by jk...@apache.org on 2014/03/09 19:32:50 UTC

svn commit: r1575736 - in /kafka/site/081: configuration.html design.html documentation.html ecosystem.html implementation.html introduction.html migration.html ops.html quickstart.html tools.html upgrade.html

Author: jkreps
Date: Sun Mar  9 18:32:49 2014
New Revision: 1575736

URL: http://svn.apache.org/r1575736
Log:
Misc. improvements from Michael Noll.


Added:
    kafka/site/081/ecosystem.html
Removed:
    kafka/site/081/tools.html
Modified:
    kafka/site/081/configuration.html
    kafka/site/081/design.html
    kafka/site/081/documentation.html
    kafka/site/081/implementation.html
    kafka/site/081/introduction.html
    kafka/site/081/migration.html
    kafka/site/081/ops.html
    kafka/site/081/quickstart.html
    kafka/site/081/upgrade.html

Modified: kafka/site/081/configuration.html
URL: http://svn.apache.org/viewvc/kafka/site/081/configuration.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/configuration.html (original)
+++ kafka/site/081/configuration.html Sun Mar  9 18:32:49 2014
@@ -38,7 +38,7 @@ Topic-level configurations and defaults 
       <td>null</td>
       <td>Specifies the zookeeper connection string in the form <code>hostname:port</code>, where hostname and port are the host and port for a node in your zookeeper cluster. To allow connecting through other zookeeper nodes when that host is down you can also specify multiple hosts in the form <code>hostname1:port1,hostname2:port2,hostname3:port3</code>.
 	<p>
-Zookeeper also allows you to add a "chroot" path which will make all kafka data for this cluster appear under a particular path. This is a way to setup multiple Kafka clusters or other applications on the same zookeeper cluster. To do this give a connection string in the form <code>hostname1:port1,hostname2:port2,hostname3:port3/chroot/path</code> which would put all this cluster's data under the path <code>/chroot/path</code>. Note that you must create this path yourself prior to starting the broker and consumers must use the same connection string.</td>
+ZooKeeper also allows you to add a "chroot" path which will make all kafka data for this cluster appear under a particular path. This is a way to setup multiple Kafka clusters or other applications on the same zookeeper cluster. To do this give a connection string in the form <code>hostname1:port1,hostname2:port2,hostname3:port3/chroot/path</code> which would put all this cluster's data under the path <code>/chroot/path</code>. Note that you must create this path yourself prior to starting the broker and consumers must use the same connection string.</td>
     </tr>
     <tr>
       <td>message.max.bytes</td>
@@ -296,7 +296,7 @@ Zookeeper also allows you to add a "chro
     <tr>
       <td>zookeeper.session.timeout.ms</td>
       <td>6000</td>
-      <td>Zookeeper session timeout. If the server fails to heartbeat to zookeeper within this period of time it is considered dead. If you set this too low the server may be falsely considered dead; if you set it too high it may take too long to recognize a truly dead server.</td>
+      <td>ZooKeeper session timeout. If the server fails to heartbeat to zookeeper within this period of time it is considered dead. If you set this too low the server may be falsely considered dead; if you set it too high it may take too long to recognize a truly dead server.</td>
     </tr>
     <tr>
       <td>zookeeper.connection.timeout.ms</td>
@@ -541,7 +541,7 @@ The essential consumer configurations ar
       <td>auto.offset.reset</td>
       <td colspan="1">largest</td>
       <td>
-        <p>What to do when there is no initial offset in Zookeeper or if an offset is out of range:<br/>* smallest : automatically reset the offset to the smallest offset<br/>* largest : automatically reset the offset to the largest offset<br/>* anything else: throw exception to the consumer</p>
+        <p>What to do when there is no initial offset in ZooKeeper or if an offset is out of range:<br/>* smallest : automatically reset the offset to the smallest offset<br/>* largest : automatically reset the offset to the largest offset<br/>* anything else: throw exception to the consumer</p>
      </td>
     </tr>
     <tr>
@@ -557,7 +557,7 @@ The essential consumer configurations ar
     <tr>
       <td>zookeeper.session.timeout.ms </td>
       <td colspan="1">6000</td>
-      <td>Zookeeper session timeout. If the consumer fails to heartbeat to zookeeper for this period of time it is considered dead and a rebalance will occur.</td>
+      <td>ZooKeeper session timeout. If the consumer fails to heartbeat to zookeeper for this period of time it is considered dead and a rebalance will occur.</td>
     </tr>
     <tr>
       <td>zookeeper.connection.timeout.ms</td>

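For reference, the ZooKeeper chroot and timeout settings covered in this file translate into broker and consumer property files roughly as follows; the host names and the /kafka chroot below are illustrative only and are not part of this patch.

<pre>
# broker: server.properties (the /kafka chroot must be created in ZooKeeper beforehand)
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
zookeeper.session.timeout.ms=6000

# consumer: must use the same connection string, including the chroot
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
auto.offset.reset=smallest
</pre>
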
Modified: kafka/site/081/design.html
URL: http://svn.apache.org/viewvc/kafka/site/081/design.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/design.html (original)
+++ kafka/site/081/design.html Sun Mar  9 18:32:49 2014
@@ -25,8 +25,8 @@ To compensate for this performance diver
 <p>
 Furthermore we are building on top of the JVM, and anyone who has spent any time with Java memory usage knows two things:
 <ol>
-	<li>The memory overhead of objects is very high, often doubling the size of the data stored (or worse).</li>
-	<li>Java garbage collection becomes increasingly fiddly and slow as the in-heap data increases.</li>
+    <li>The memory overhead of objects is very high, often doubling the size of the data stored (or worse).</li>
+    <li>Java garbage collection becomes increasingly fiddly and slow as the in-heap data increases.</li>
 </ol>
 <p>
 As a result of these factors using the filesystem and relying on pagecache is superior to maintaining an in-memory cache or other structure&mdash;we at least double the available cache by having automatic access to all free memory, and likely double again by storing a compact byte structure rather than individual objects. Doing so will result in a cache of up to 28-30GB on a 32GB machine without GC penalties. Furthermore this cache will stay warm even if the service is restarted, whereas the in-process cache will need to be rebuilt in memory (which for a 10GB cache may take 10 minutes) or else it will need to start with a completely cold cache (which likely means terrible initial performance). This also greatly simplifies the code as all logic for maintaining coherency between the cache and filesystem is now in the OS, which tends to do so more efficiently and more correctly than one-off in-process attempts. If your disk usage favors linear reads then read-ahead is effectively pre-populating this cache with useful data on each disk read.
@@ -63,10 +63,10 @@ The message log maintained by the broker
 <p>
 To understand the impact of sendfile, it is important to understand the common data path for transfer of data from file to socket:
 <ol>
-	<li>The operating system reads data from the disk into pagecache in kernel space</li>
-	<li>The application reads the data from kernel space into a user-space buffer</li>
-	<li>The application writes the data back into kernel space into a socket buffer</li>
-	<li>The operating system copies the data from the socket buffer to the NIC buffer where it is sent over the network</li>
+    <li>The operating system reads data from the disk into pagecache in kernel space</li>
+    <li>The application reads the data from kernel space into a user-space buffer</li>
+    <li>The application writes the data back into kernel space into a socket buffer</li>
+    <li>The operating system copies the data from the socket buffer to the NIC buffer where it is sent over the network</li>
 </ol>
 <p>
 This is clearly inefficient, there are four copies and two system calls. Using sendfile, this re-copying is avoided by allowing the OS to send the data from pagecache to the network directly. So in this optimized path, only the final copy to the NIC buffer is needed.
@@ -96,14 +96,16 @@ The client controls which partition it p
 <h4>Asynchronous send</h4>
 <p>
 Batching is one of the big drivers of efficiency, and to enable batching the Kafka producer has an asynchronous mode that accumulates data in memory and sends out larger batches in a single request. The batching can be configured to accumulate no more than a fixed number of messages and to wait no longer than some fixed latency bound (say 100 messages or 5 seconds). This allows the accumulation of more bytes to send, and few larger I/O operations on the servers. Since this buffering happens in the client it obviously reduces the durability as any data buffered in memory and not yet sent will be lost in the event of a producer crash.
+<p>
+Note that as of Kafka 0.8.1 the async producer does not provide a callback that could be used to register handlers to catch send errors. Adding such callback functionality is proposed for Kafka 0.9; see the <a href="https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ProposedProducerAPI">Proposed Producer API</a> wiki page.
 
 <h3><a id="theconsumer">4.5 The Consumer</a></h3>
 
-The Kafka consumer works by issuing "fetch" requests to the brokers leading the partitions it wants to consume. The consumer specifies its position in the log with each request and receives back a chunk of log beginning at that position. The consumer thus has significant control over this position and can rewind it to re-consume data if need be.
+The Kafka consumer works by issuing "fetch" requests to the brokers leading the partitions it wants to consume. The consumer specifies its offset in the log with each request and receives back a chunk of log beginning from that position. The consumer thus has significant control over this position and can rewind it to re-consume data if need be.
 
 <h4>Push vs. pull</h4>
 <p>
-An initial question we considered is whether consumers should pull data from brokers or brokers should push data to the consumer. In this respect Kafka follows a more traditional design, shared by most messaging systems, where data is pushed to the broker from the producer and pulled from the broker by the consumer. Some logging-centric systems, such as <a href="http://github.com/facebook/scribe">scribe</a> and <a href="http://github.com/cloudera/flume">flume</a> follow a very different push based path where  data is pushed downstream. There are pros and cons to both approaches. However a push-based system has difficulty dealing with diverse consumers as the broker controls the rate at which data is transferred. The goal is generally for the consumer to be able to consume at the maximum possible rate; unfortunately in a push system this means the consumer tends to be overwhelmed when its rate of consumption falls below the rate of production (a denial of service attack, in essence).
  A pull-based system has the nicer property that the consumer simply falls behind and catches up when it can. This can be mitigated with some kind of backoff protocol by which the consumer can indicate it is overwhelmed, but getting the rate of transfer to fully utilize (but never over-utilize) the consumer is trickier than it seems. Previous attempts at building systems in this fashion led us to go with a more traditional pull model.
+An initial question we considered is whether consumers should pull data from brokers or brokers should push data to the consumer. In this respect Kafka follows a more traditional design, shared by most messaging systems, where data is pushed to the broker from the producer and pulled from the broker by the consumer. Some logging-centric systems, such as <a href="http://github.com/facebook/scribe">Scribe</a> and <a href="http://flume.apache.org/">Apache Flume</a> follow a very different push based path where  data is pushed downstream. There are pros and cons to both approaches. However a push-based system has difficulty dealing with diverse consumers as the broker controls the rate at which data is transferred. The goal is generally for the consumer to be able to consume at the maximum possible rate; unfortunately in a push system this means the consumer tends to be overwhelmed when its rate of consumption falls below the rate of production (a denial of service attack, in essence). 
 A pull-based system has the nicer property that the consumer simply falls behind and catches up when it can. This can be mitigated with some kind of backoff protocol by which the consumer can indicate it is overwhelmed, but getting the rate of transfer to fully utilize (but never over-utilize) the consumer is trickier than it seems. Previous attempts at building systems in this fashion led us to go with a more traditional pull model.
 <p>
 Another advantage of a pull-based system is that it lends itself to aggressive batching of data sent to the consumer. A push-based system must choose to either send a request immediately or accumulate more data and then send it later without knowledge of whether the downstream consumer will be able to immediately process it. If tuned for low latency this will result in sending a single message at a time only for the transfer to end up being buffered anyway, which is wasteful. A pull-based design fixes this as the consumer always pulls all available messages after its current position in the log (or up to some configurable max size). So one gets optimal batching without introducing unnecessary latency.
 <p>
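To make the async batching above concrete, here is a rough sketch against the 0.8 Java producer API; the broker list, topic, and batching thresholds are made up for illustration, and (per the note above) there is no callback for catching send errors in 0.8.1.

<pre>
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class AsyncProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // illustrative hosts
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("producer.type", "async");          // buffer messages in the client and send in batches
        props.put("batch.num.messages", "100");       // flush after ~100 messages...
        props.put("queue.buffering.max.ms", "5000");  // ...or after 5 seconds, whichever comes first

        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("my-topic", "key-123", "hello"));
        producer.close(); // flushes any buffered, unsent messages
    }
}
</pre>
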
@@ -133,13 +135,13 @@ In the case of Hadoop we parallelize the
 Now that we understand a little about how producers and consumers work, let's discuss the semantic guarantees Kafka provides between producer and consumer. Clearly there are multiple possible message delivery guarantees that could be provided:
 <ul>
   <li>
-	<i>At most once</i>&mdash;Messages may be lost but are never redelivered.
+    <i>At most once</i>&mdash;Messages may be lost but are never redelivered.
   </li>
   <li>
-	<i>At least once</i>&mdash;Messages are never lost but may be redelivered.
+    <i>At least once</i>&mdash;Messages are never lost but may be redelivered.
   </li>
   <li>
-	<i>Exactly once</i>&mdash;this is what people actually want, each message is delivered once and only once.
+    <i>Exactly once</i>&mdash;this is what people actually want, each message is delivered once and only once.
   </li>
 </ul>	
 
@@ -174,8 +176,8 @@ Followers consume messages from the lead
 <p>
 As with most distributed systems automatically handling failures requires having a precise definition of what it means for a node to be "alive". For Kafka node liveness has two conditions
 <ol>
-	<li>A node must be able to maintain its session with Zookeeper (via Zookeeper's heartbeat mechanism)
-	<li>If it is a slave it must replicate the writes happening on the leader and not fall "too far" behind
+    <li>A node must be able to maintain its session with ZooKeeper (via ZooKeeper's heartbeat mechanism)
+    <li>If it is a slave it must replicate the writes happening on the leader and not fall "too far" behind
 </ol>
 We refer to nodes satisfying these two conditions as being "in sync" to avoid the vagueness of "alive" or "failed". The leader keeps track of the set of "in sync" nodes. If a follower dies, gets stuck, or falls behind, the leader will remove it from the list of in sync replicas. The definition of, how far behind is too far, is controlled by the replica.lag.max.messages configuration and the definition of a stuck replica is controlled by the replica.lag.time.max.ms configuration.
 <p>
@@ -201,9 +203,9 @@ A common approach to this tradeoff is to
 <p>
 This majority vote approach has a very nice property: the latency is dependent on only the fastest servers. That is, if the replication factor is three, the latency is determined by the faster slave not the slower one.
 <p>
-There are a rich variety of algorithms in this family including Zookeeper's <a href="http://www.stanford.edu/class/cs347/reading/zab.pdf">Zab</a>, <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>, and <a href="http://pmg.csail.mit.edu/papers/vr-revisited.pdf">Viewstamped Replication</a>. The most similar academic publication we are aware of to Kafka's actual implementation is <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=66814">PacificA</a> from Microsoft.
+There are a rich variety of algorithms in this family including ZooKeeper's <a href="http://www.stanford.edu/class/cs347/reading/zab.pdf">Zab</a>, <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>, and <a href="http://pmg.csail.mit.edu/papers/vr-revisited.pdf">Viewstamped Replication</a>. The most similar academic publication we are aware of to Kafka's actual implementation is <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=66814">PacificA</a> from Microsoft.
 <p>
-The downside of majority vote is that it doesn't take many failures to leave you with no electable leaders. To tolerate one failure requires three copies of the data, and to tolerate two failures requires five copies of the data. In our experience having only enough redundancy to tolerate a single failure is not enough for a practical system, but doing every write five times, with 5x the disk space requirements and 1/5th the throughput, is not very practical for large volume data problems. This is likely why quorum algorithms more commonly appear for shared cluster configuration such as Zookeeper but are less common for primary data storage. For example in HDFS the namenode's high-availability feature is built on a <a href="http://blog.cloudera.com/blog/2012/10/quorum-based-journaling-in-cdh4-1">majority-vote-based journal</a>, but this more expensive approach is not used for the data itself.
+The downside of majority vote is that it doesn't take many failures to leave you with no electable leaders. To tolerate one failure requires three copies of the data, and to tolerate two failures requires five copies of the data. In our experience having only enough redundancy to tolerate a single failure is not enough for a practical system, but doing every write five times, with 5x the disk space requirements and 1/5th the throughput, is not very practical for large volume data problems. This is likely why quorum algorithms more commonly appear for shared cluster configuration such as ZooKeeper but are less common for primary data storage. For example in HDFS the namenode's high-availability feature is built on a <a href="http://blog.cloudera.com/blog/2012/10/quorum-based-journaling-in-cdh4-1">majority-vote-based journal</a>, but this more expensive approach is not used for the data itself.
 <p>
 Kafka takes a slightly different approach to choosing its quorum set. Instead of majority vote, Kafka dynamically maintains a set of in-sync replicas (ISR) that are caught-up to the leader. Only members of this set are eligible for election as leader. A write to a Kafka partition is not considered committed until <i>all</i> in-sync replicas have received the write. This ISR set is persisted to zookeeper whenever it changes. Because of this, any replica in the ISR is eligible to be elected leader. This is an important factor for Kafka's usage model where there are many partitions and ensuring leadership balance is important. With this ISR model and <i>f+1</i> replicas, a Kafka topic can tolerate <i>f</i> failures without losing committed messages.
 <p>
@@ -217,8 +219,8 @@ Note that Kafka's guarantee with respect
 <p>
 However a practical system needs to do something reasonable when all the replicas die. If you are unlucky enough to have this occur, it is important to consider what will happen. There are two behaviors that could be implemented:
 <ol>
-	<li>Wait for a replica in the ISR to come back to life and choose this replica as the leader (hopefully it still has all its data).
-	<li>Choose the first replica (not necessarily in the ISR) that comes back to life as the leader.
+    <li>Wait for a replica in the ISR to come back to life and choose this replica as the leader (hopefully it still has all its data).
+    <li>Choose the first replica (not necessarily in the ISR) that comes back to life as the leader.
 </ol>
 <p>
 This is a simple tradeoff between availability and consistency. If we wait for replicas in the ISR, then we will remain unavailable as long as those replicas are down. If such replicas were destroyed or their data was lost, then we are permanently down. If, on the other hand, a non-in-sync replica comes back to life and we allow it to become leader, then its log becomes the source of truth even though it is not guaranteed to have every committed message. In our current release we choose the second strategy and favor choosing a potentially inconsistent replica when all replicas in the ISR are dead. In the future, we would like to make this configurable to better support use cases where downtime is preferable to inconsistency.
@@ -233,21 +235,21 @@ It is also important to optimize the lea
 
 <h3><a id="compaction">4.8 Log Compaction</a></h3>
 
-One of the ways Kafka is different from traditional messaging systems is that it maintains historical log data beyond just the currently in-flight messages. This allows a set of usage patterns and system architectures that make use of this as a kind of external commit log. Log compaction is a feature that helps to support this kind of use case.
+Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition. It addresses use cases such as restoring state after application crashes or system failures, or reloading caches after application restarts during operational maintenance. Let's dive into these use cases in more detail and then describe how compaction works.
 <p>
 So far we have described only the simpler approach to data retention where old log data is discarded after a fixed period of time or when the log reaches some predetermined size. This works well for temporal event data such as logging where each record stands alone. However an important class of data streams are the log of changes to keyed, mutable data (for example, the changes to a database table).
 <p>
-Let's discuss a concrete example of such a stream. Say we have a topic containing user email addresses; every time a user updates their email address we send a message to this topic using their user id as the primary key. Now say we send the following messages over some time period for a user with id 123, each message corresponding to a change in email address:
+Let's discuss a concrete example of such a stream. Say we have a topic containing user email addresses; every time a user updates their email address we send a message to this topic using their user id as the primary key. Now say we send the following messages over some time period for a user with id 123, each message corresponding to a change in email address (messages for other ids are omitted):
 <pre>
-	123 => bill@microsoft.com
-	        .
-	        .
-	        .
-	123 => bill@gatesfoundation.org
-	        .
-	        .
-	        .
-	123 => bill@gmail.com
+    123 => bill@microsoft.com
+            .
+            .
+            .
+    123 => bill@gatesfoundation.org
+            .
+            .
+            .
+    123 => bill@gmail.com
 </pre>
 Log compaction gives us a more granular retention mechanism so that we are guaranteed to retain at least the last update for each primary key (e.g. <code>bill@gmail.com</code>). By doing this we guarantee that the log contains a full snapshot of the final value for every key not just keys that changed recently. This means downstream consumers can restore their own state off this topic without us having to retain a complete log of all changes.
 <p>
@@ -265,7 +267,7 @@ The general idea is quite simple. If we 
 <p>
 Log compaction is a mechanism to give finer-grained per-record retention, rather than the coarser-grained time-based retention. The idea is to selectively remove records where we have a more recent update with the same primary key. This way the log is guaranteed to have at least the last state for each key.
 <p>
-This retention policy can be set per-topic, so topics retained by time can mix with topics which are compacted.
+This retention policy can be set per-topic, so a single cluster can have some topics where retention is enforced by size or time and other topics where retention is enforced by compaction.
 <p> 
 This functionality is inspired by one of LinkedIn's oldest and most successful pieces of infrastructure&mdash;a database changelog caching service called <a href="https://github.com/linkedin/databus">Databus</a>. Unlike most log-structured storage systems Kafka is built for subscription and organizes data for fast linear reads and writes. Unlike Databus, Kafka acts a source-of-truth store so it is useful even in situations where the upstream data source would not otherwise be replayable.
 
@@ -280,6 +282,7 @@ The head of the log is identical to a tr
 Compaction also allows for deletes. A message with a key and a null payload will be treated as a delete from the log. This delete marker will cause any prior message with that key to be removed (as would any new message with that key), but delete markers are special in that they will themselves be cleaned out of the log after a period of time to free up space. The point in time at which deletes are no longer retained is marked as the "delete retention point" in the above diagram.
 <p>
 The compaction is done in the background by periodically recopying log segments. Cleaning does not block reads and can be throttled to use no more than a configurable amount of I/O throughput to avoid impacting producers and consumers. The actual process of compacting a log segment looks something like this:
+<p>
 <img src="/images/log_compaction.png">
 <p>
 <h4>What guarantees does log compaction provide?</h4>
@@ -287,9 +290,10 @@ The compaction is done in the background
 Log compaction guarantees the following:
 <ol>
 <li>Any consumer that stays caught-up to within the head of the log will see every message that is written and messages will have sequential offsets.
-<li>Message order is always maintained (that is, compaction will never re-order messages just remove some).
-<li>The offset for a message never changes, it is the permanent identifier for a position in the log.
+<li>Ordering of messages is always maintained.  Compaction will never re-order messages, just remove some.
+<li>The offset for a message never changes.  It is the permanent identifier for a position in the log.
 <li>Any read progressing from offset 0 will see at least the final state of all records in the order they were written. All delete markers for deleted records will be seen provided the reader reaches the head of the log in a time period less than the topic's delete.retention.ms setting (the default is 24 hours). This is important as delete marker removal happens concurrently with read (and thus it is important that we not remove any delete marker prior to the reader seeing it).
+<li>Any consumer progressing from the start of the log will see at least the <em>final</em> state of all records in the order they were written. All delete markers for deleted records will be seen provided the consumer reaches the head of the log in a time period less than the topic's <code>delete.retention.ms</code> setting (the default is 24 hours). This is important as delete marker removal happens concurrently with read, and thus it is important that we do not remove any delete marker prior to the consumer seeing it.
 </ol>
 
 <h4>Log Compaction Details</h4>
@@ -312,7 +316,9 @@ This can be done either at topic creatio
 <p>
 Further cleaner configurations are described <a href="/documentation.html#brokerconfigs">here</a>.
 
-<h4>Limitations</h4>
-We haven't yet made it configurable how much log is retained without compaction (the "head" of the log), so currently all segments are eligible except for the last segment (the one currently being written to).
-<p>
-Log compaction is also not yet compatible with compression.
\ No newline at end of file
+<h4>Log Compaction Limitations</h4>
+
+<ol>
+  <li>You cannot yet configure how much of the log is retained without compaction (the "head" of the log). Currently all segments are eligible except for the last segment, i.e. the one currently being written to.</li>
+  <li>Log compaction is not yet compatible with compressed topics.</li>
+</ol>
\ No newline at end of file

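As a worked example of the per-topic retention policy described above, a compacted topic could be created roughly like this with the 0.8.1 topic tool; the topic name is made up, and the broker-side log cleaner (log.cleaner.enable) also has to be switched on for the background recopying to run.

<pre>
> bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic user-emails \
      --partitions 1 --replication-factor 1 --config cleanup.policy=compact
</pre>
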
Modified: kafka/site/081/documentation.html
URL: http://svn.apache.org/viewvc/kafka/site/081/documentation.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/documentation.html (original)
+++ kafka/site/081/documentation.html Sun Mar  9 18:32:49 2014
@@ -3,90 +3,91 @@
 <h1>Kafka 0.8.1 Documentation</h1>
 Prior releases: <a href="/07/documentation.html">0.7.x</a>, <a href="/08/documentation.html">0.8.0</a>.
 </ul>
-	
+    
 <ul class="toc">
     <li><a href="#gettingStarted">1. Getting Started</a>
          <ul>
              <li><a href="#introduction">1.1 Introduction</a>
-	         <li><a href="#uses">1.2 Use Cases</a>
+             <li><a href="#uses">1.2 Use Cases</a>
              <li><a href="#quickstart">1.3 Quick Start</a>
-	         <li><a href="#upgrade">1.4 Upgrading</a>
+             <li><a href="#ecosystem">1.4 Ecosystem</a>
+             <li><a href="#upgrade">1.5 Upgrading</a>
          </ul>
     <li><a href="#api">2. API</a>
-	      <ul>
-		      <li><a href="#producerapi">2.1 Producer API</a>
-			  <li><a href="#highlevelconsumerapi">2.2 High Level Consumer API</a>
-			  <li><a href="#simpleconsumerapi">2.3 Simple Consumer API</a>
-			  <li><a href="#kafkahadoopconsumerapi">2.4 Kafka Hadoop Consumer API</a>
-		  </ul>
+          <ul>
+              <li><a href="#producerapi">2.1 Producer API</a>
+              <li><a href="#highlevelconsumerapi">2.2 High Level Consumer API</a>
+              <li><a href="#simpleconsumerapi">2.3 Simple Consumer API</a>
+              <li><a href="#kafkahadoopconsumerapi">2.4 Kafka Hadoop Consumer API</a>
+          </ul>
     <li><a href="#configuration">3. Configuration</a>
-	    <ul>
-		     <li><a href="#brokerconfigs">3.1 Broker Configs</a>
-			 <li><a href="#consumerconfigs">3.2 Consumer Configs</a>
-		     <li><a href="#producerconfigs">3.3 Producer Configs</a>
-		</ul>
+        <ul>
+             <li><a href="#brokerconfigs">3.1 Broker Configs</a>
+             <li><a href="#consumerconfigs">3.2 Consumer Configs</a>
+             <li><a href="#producerconfigs">3.3 Producer Configs</a>
+        </ul>
     <li><a href="#design">4. Design</a>
-	    <ul>
-		     <li><a href="#majordesignelements">4.1 Motivation</a>
-			 <li><a href="#persistence">4.2 Persistence</a>
-			 <li><a href="#maximizingefficiency">4.3 Efficiency</a>
-			 <li><a href="#theproducer">4.4 The Producer</a>
-			 <li><a href="#theconsumer">4.5 The Consumer</a>
-			 <li><a href="#semantics">4.6 Message Delivery Semantics</a>
-			 <li><a href="#replication">4.7 Replication</a>
-			 <li><a href="#compaction">4.8 Log Compaction</a>
-		</ul>
-	<li><a href="#implementation">5. Implementation</a>
-		<ul>
-			  <li><a href="#apidesign">5.1 API Design</a>
-			  <li><a href="#networklayer">5.2 Network Layer</a>
-			  <li><a href="#messages">5.3 Messages</a>
-			  <li><a href="#messageformat">5.4 Message format</a>
-			  <li><a href="#log">5.5 Log</a>
-			  <li><a href="#distributionimpl">5.6 Distribution</a>
-		</ul>
-	<li><a href="#operations">6. Operations</a>
-		<ul>
-			 <li><a href="#basic_ops">6.1 Basic Kafka Operations</a>
-				<ul>
-  			 		<li><a href="#basic_ops_add_topic">Adding and removing topics</a>
-			 		<li><a href="#basic_ops_modify_topic">Modifying topics</a>
-			 		<li><a href="#basic_ops_restarting">Graceful shutdown</a>
-			 		<li><a href="#basic_ops_leader_balancing">Balancing leadership</a>
-					<li><a href="#basic_ops_consumer_lag">Checking consumer position</a>
-			 		<li><a href="#basic_ops_mirror_maker">Mirroring data between clusters</a>
-			 		<li><a href="#basic_ops_cluster_expansion">Expanding your cluster</a>
-				</ul>
-			 <li><a href="#datacenters">6.2 Datacenters</a>
-			 <li><a href="#config">6.3 Important Configs</a>
-				 <ul>
-					 <li><a href="#serverconfig">Important Server Configs</a>
-					 <li><a href="#clientconfig">Important Client Configs</a>
-					 <li><a href="#prodconfig">A Production Server Configs</a>
-        		 </ul>
-     		  <li><a href="#java">6.4 Java Version</a>
-	 		  <li><a href="#hwandos">6.5 Hardware and OS</a>
-				<ul>
-					<li><a href="#os">OS</a>
-					<li><a href="#diskandfs">Disks and Filesystems</a>
-					<li><a href="#appvsosflush">Application vs OS Flush Management</a>
-					<li><a href="#linuxflush">Linux Flush Behavior</a>
-					<li><a href="#ext4">Ext4 Notes</a>
-				</ul>
-			  <li><a href="#monitoring">6.6 Monitoring</a>
-			  <li><a href="#zk">6.7 Zookeeper</a>
-				<ul>
-					<li><a href="#zkversion">Stable Version</a>
-					<li><a href="#zkops">Operationalization</a>
-				</ul>
-		</ul>
-	<li><a href="#tools">7. Tools</a>
+        <ul>
+             <li><a href="#majordesignelements">4.1 Motivation</a>
+             <li><a href="#persistence">4.2 Persistence</a>
+             <li><a href="#maximizingefficiency">4.3 Efficiency</a>
+             <li><a href="#theproducer">4.4 The Producer</a>
+             <li><a href="#theconsumer">4.5 The Consumer</a>
+             <li><a href="#semantics">4.6 Message Delivery Semantics</a>
+             <li><a href="#replication">4.7 Replication</a>
+             <li><a href="#compaction">4.8 Log Compaction</a>
+        </ul>
+    <li><a href="#implementation">5. Implementation</a>
+        <ul>
+              <li><a href="#apidesign">5.1 API Design</a>
+              <li><a href="#networklayer">5.2 Network Layer</a>
+              <li><a href="#messages">5.3 Messages</a>
+              <li><a href="#messageformat">5.4 Message format</a>
+              <li><a href="#log">5.5 Log</a>
+              <li><a href="#distributionimpl">5.6 Distribution</a>
+        </ul>
+    <li><a href="#operations">6. Operations</a>
+        <ul>
+             <li><a href="#basic_ops">6.1 Basic Kafka Operations</a>
+                <ul>
+                     <li><a href="#basic_ops_add_topic">Adding and removing topics</a>
+                     <li><a href="#basic_ops_modify_topic">Modifying topics</a>
+                     <li><a href="#basic_ops_restarting">Graceful shutdown</a>
+                     <li><a href="#basic_ops_leader_balancing">Balancing leadership</a>
+                     <li><a href="#basic_ops_consumer_lag">Checking consumer position</a>
+                     <li><a href="#basic_ops_mirror_maker">Mirroring data between clusters</a>
+                     <li><a href="#basic_ops_cluster_expansion">Expanding your cluster</a>
+                </ul>
+             <li><a href="#datacenters">6.2 Datacenters</a>
+             <li><a href="#config">6.3 Important Configs</a>
+                 <ul>
+                     <li><a href="#serverconfig">Important Server Configs</a>
+                     <li><a href="#clientconfig">Important Client Configs</a>
+                     <li><a href="#prodconfig">A Production Server Configs</a>
+                 </ul>
+               <li><a href="#java">6.4 Java Version</a>
+               <li><a href="#hwandos">6.5 Hardware and OS</a>
+                <ul>
+                    <li><a href="#os">OS</a>
+                    <li><a href="#diskandfs">Disks and Filesystems</a>
+                    <li><a href="#appvsosflush">Application vs OS Flush Management</a>
+                    <li><a href="#linuxflush">Linux Flush Behavior</a>
+                    <li><a href="#ext4">Ext4 Notes</a>
+                </ul>
+              <li><a href="#monitoring">6.6 Monitoring</a>
+              <li><a href="#zk">6.7 ZooKeeper</a>
+                <ul>
+                    <li><a href="#zkversion">Stable Version</a>
+                    <li><a href="#zkops">Operationalization</a>
+                </ul>
+        </ul>
 </ul>
 
 <h2><a id="gettingStarted">1. Getting Started</a></h2>
 <!--#include virtual="introduction.html" -->
 <!--#include virtual="uses.html" -->
 <!--#include virtual="quickstart.html" -->
+<!--#include virtual="ecosystem.html" -->
 <!--#include virtual="upgrade.html" -->
 
 <h2><a id="api">2. API</a></h2>
@@ -109,8 +110,4 @@ Prior releases: <a href="/07/documentati
 
 <!--#include virtual="ops.html" -->
 
-<h2><a id="tools">7. Tools</a></h2>
-
-<!--#include virtual="tools.html" -->
-
 <!--#include virtual="../includes/footer.html" -->

Added: kafka/site/081/ecosystem.html
URL: http://svn.apache.org/viewvc/kafka/site/081/ecosystem.html?rev=1575736&view=auto
==============================================================================
--- kafka/site/081/ecosystem.html (added)
+++ kafka/site/081/ecosystem.html Sun Mar  9 18:32:49 2014
@@ -0,0 +1,3 @@
+<h3><a id="ecosystem">1.4 Ecosystem</a></h3>
+
+A plethora of tools integrate with Kafka outside the main distribution. The <a href="https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem">ecosystem page</a> lists many of these, including stream processing systems, Hadoop integration, monitoring, and deployment tools.

Modified: kafka/site/081/implementation.html
URL: http://svn.apache.org/viewvc/kafka/site/081/implementation.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/implementation.html (original)
+++ kafka/site/081/implementation.html Sun Mar  9 18:32:49 2014
@@ -230,7 +230,7 @@ Note that two kinds of corruption must b
 </p>
 
 <h3><a id="distributionimpl">5.6 Distribution</a></h3>
-<h4>Zookeeper Directories</h4>
+<h4>ZooKeeper Directories</h4>
 <p>
 The following gives the zookeeper structures and algorithms used for co-ordination between consumers and brokers.
 </p>
@@ -261,7 +261,7 @@ Each broker registers itself under the t
 
 <h4>Consumers and Consumer Groups</h4>
 <p>
-Consumers of topics also register themselves in Zookeeper, in order to balance the consumption of data and track their offsets in each partition for each broker they consume from.
+Consumers of topics also register themselves in ZooKeeper, in order to balance the consumption of data and track their offsets in each partition for each broker they consume from.
 </p>
 
 <p>

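The broker and consumer registrations described in this file can be inspected directly; a quick sketch using the ZooKeeper shell shipped with Kafka (the group name is hypothetical and the paths assume no chroot):

<pre>
> bin/zookeeper-shell.sh localhost:2181
ls /brokers/ids                 # one ephemeral node per live broker
ls /brokers/topics              # topic and partition registrations
ls /consumers/my-group/offsets  # offsets tracked by the group, organized by topic and partition
</pre>
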
Modified: kafka/site/081/introduction.html
URL: http://svn.apache.org/viewvc/kafka/site/081/introduction.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/introduction.html (original)
+++ kafka/site/081/introduction.html Sun Mar  9 18:32:49 2014
@@ -16,7 +16,7 @@ So, at a high level, producers send mess
   <img src="../images/producer_consumer.png">
 </div>
 
-Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol">TCP protocol</a>. We provide a java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
+Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol">TCP protocol</a>. We provide a Java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
 
 <h4>Topics and Logs</h4>
 Let's first dive into the high-level abstraction Kafka provides&mdash;the topic.
@@ -43,7 +43,7 @@ Each partition has one server which acts
 
 <h4>Producers</h4>
 
-Producers publish data to the topics of their choice. The producer is able to choose which message to assign to which partition within the topic. This can be done in a round-robin fashion simply to balance load or it can be done according to some semantic partition function (say based on some key in the message). More on the use of partitioning in a second.
+Producers publish data to the topics of their choice. The producer is responsible for choosing which message to assign to which partition within the topic. This can be done in a round-robin fashion simply to balance load or it can be done according to some semantic partition function (say based on some key in the message). More on the use of partitioning in a second.
 
 <h4><a id="intro_consumers">Consumers</a></h4>
 
@@ -69,7 +69,7 @@ A traditional queue retains messages in-
 <p>
 Kafka does it better. By having a notion of parallelism&mdash;the partition&mdash;within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances. Note however that there cannot be more consumer instances than partitions.
 <p>
-Not that partitioning means Kafka only provides a total order over messages <i>within</i> a partition. This combined with the ability to partition data by key is sufficient for the vast majority of applications. However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process.
+Kafka only provides a total order over messages <i>within</i> a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process.
 
 <h4>Guarantees</h4>
 

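As a sketch of the keyed partitioning mentioned in the producers section, the 0.8 Java producer can be pointed at a hash-based partitioner so that all messages with the same key land in the same partition; the host, topic, and key below are illustrative only.

<pre>
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KeyedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092");                    // illustrative host
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("partitioner.class", "kafka.producer.DefaultPartitioner");  // hashes the message key to pick a partition

        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        // every message keyed by "user-123" hashes to the same partition, so its per-key order is preserved
        producer.send(new KeyedMessage<String, String>("page-views", "user-123", "/index.html"));
        producer.close();
    }
}
</pre>
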
Modified: kafka/site/081/migration.html
URL: http://svn.apache.org/viewvc/kafka/site/081/migration.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/migration.html (original)
+++ kafka/site/081/migration.html Sun Mar  9 18:32:49 2014
@@ -1,7 +1,7 @@
 <!--#include virtual="../includes/header.html" -->
 <h2>Migrating from 0.7.x to 0.8</h2>
 
-0.8 is our first (and hopefully last) release with a non-backwards-compatible wire protocol, zookeeper layout, and on-disk data format. This was a chance for us to clean up a lot of cruft and start fresh. This means performing a no-downtime upgrade is more painful than normal&mdash;you cannot just swap in the new code in-place.
+0.8 is our first (and hopefully last) release with a non-backwards-compatible wire protocol, ZooKeeper layout, and on-disk data format. This was a chance for us to clean up a lot of cruft and start fresh. This means performing a no-downtime upgrade is more painful than normal&mdash;you cannot just swap in the new code in-place.
 
 <h3>Migration Steps</h3>
 

Modified: kafka/site/081/ops.html
URL: http://svn.apache.org/viewvc/kafka/site/081/ops.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/ops.html (original)
+++ kafka/site/081/ops.html Sun Mar  9 18:32:49 2014
@@ -137,9 +137,9 @@ For applications that need a global view
 <p>
 This is not the only possible deployment pattern. It is possible to read from or write to a remote Kafka cluster over the WAN, though obviously this will add whatever latency is required to get the cluster.
 <p>
-Kafka naturally batches data in both the producer and consumer so it can achieve high-throughput even over a high-latency connection. To allow this though it may be necessary to increase the TCP socket buffer sizes for the producer, consumer, and broker using the <code>socket.send.buffer.bytes</code> and <code>socket.receive.buffer.bytes</code> configurations. The appropriate way to set this is documented <a hef="http://en.wikipedia.org/wiki/Bandwidth-delay_product">here</a>.
+Kafka naturally batches data in both the producer and consumer so it can achieve high-throughput even over a high-latency connection. To allow this though it may be necessary to increase the TCP socket buffer sizes for the producer, consumer, and broker using the <code>socket.send.buffer.bytes</code> and <code>socket.receive.buffer.bytes</code> configurations. The appropriate way to set this is documented <a href="http://en.wikipedia.org/wiki/Bandwidth-delay_product">here</a>.
 <p>
-It is generally <i>not</i> advisable to run a <i>single</i> Kafka cluster that spans multiple datacenters over a high-latency link. This will incur very high replication latency both for Kafka writes and Zookeeper writes, and neither Kafka nor Zookeeper will remain available in all locations if the network between locations is unavailable.
+It is generally <i>not</i> advisable to run a <i>single</i> Kafka cluster that spans multiple datacenters over a high-latency link. This will incur very high replication latency both for Kafka writes and ZooKeeper writes, and neither Kafka nor ZooKeeper will remain available in all locations if the network between locations is unavailable.
 
 <h3><a id="config">6.3 Kafka Configuration</a></h3>
 
@@ -147,7 +147,7 @@ It is generally <i>not</i> advisable to 
 The most important producer configurations control
 <ul>
 	<li>compression</li>
-	<li>sync vs async production></li>
+	<li>sync vs async production</li>
 	<li>batch size (for async producers)</li>
 </ul>
 The most important consumer configuration is the fetch size.
@@ -264,7 +264,7 @@ Pdflush has a configurable policy that c
 <p>
 You can see the current state of OS memory usage by doing
 <pre>
-  cat /proc/meminfo
+  &gt; cat /proc/meminfo
 </pre>
 The meaning of these values are described in the link above.
 <p>
@@ -427,24 +427,20 @@ On the client side, we recommend monitor
 <h4>Audit</h4>
 The final alerting we do is on the correctness of the data delivery. We audit that every message that is sent is consumed by all consumers and measure the lag for this to occur. For important topics we alert if a certain completeness is not achieved in a certain time period. The details of this are discussed in KAFKA-260.
 
-<h3><a id="zk">6.7 Zookeeper</a></h3>
+<h3><a id="zk">6.7 ZooKeeper</a></h3>
 
 <h4><a id="zkversion">Stable version</a></h4>
-At LinkedIn, we are running Zookeeper 3.3.*. Version 3.3.3 has known serious issues regarding ephemeral node deletion and session expirations. After running into those issues in production, we upgraded to 3.3.4 and have been running that smoothly for over a year now.
+At LinkedIn, we are running ZooKeeper 3.3.*. Version 3.3.3 has known serious issues regarding ephemeral node deletion and session expirations. After running into those issues in production, we upgraded to 3.3.4 and have been running that smoothly for over a year now.
 
-<h4><a id="zkops">Operationalizing Zookeeper</a></h4>
-Operationally, we do the following for a healthy Zookeeper installation:
-<p>
-Redundancy in the physical/hardware/network layout: try not to put them all in the same rack, decent (but don't go nuts) hardware, try to keep redundant power and network paths, etc
-<p>
-I/O segregation: if you do a lot of write type traffic you'll almost definitely want the transaction logs on a different disk group than application logs and snapshots (the write to the Zookeeper service has a synchronous write to disk, which can be slow).
-<p>
-Application segregation: Unless you really understand the application patterns of other apps that you want to install on the same box, it can be a good idea to run Zookeeper in isolation (though this can be a balancing act with the capabilities of the hardware).
-Use care with virtualization: It can work, depending on your cluster layout and read/write patterns and SLAs, but the tiny overheads introduced by the virtualization layer can add up and throw off Zookeeper, as it can be very time sensitive
-<p>
-Zookeeper configuration and monitoring: It's java, make sure you give it 'enough' heap space (We usually run them with 3-5G, but that's mostly due to the data set size we have here). Unfortunately we don't have a good formula for it. As far as monitoring, both JMZ and the 4 letter commands are very useful, they do overlap in some cases (and in those cases we prefer the 4 letter commands, they seem more predictable, or at the very least, they work better with the LI monitoring infrastructure)
-Don't overbuild the cluster: large clusters, especially in a write heavy usage pattern, means a lot of intracluster communication (quorums on the writes and subsequent cluster member updates), but don't underbuild it (and risk swamping the cluster).
-<p>
-Try to run on a 3-5 node cluster: Zookeeper writes use quorums and inherently that means having an odd number of machines in a cluster. Remember that a 5 node cluster will cause writes to slow down compared to a 3 node cluster, but will allow more fault tolerance.
-<p>
-Overall, we try to keep the Zookeeper system as small as will handle the load (plus standard growth capacity planning) and as simple as possible. We try not to do anything fancy with the configuration or application layout as compared to the official release as well as keep it as self contained as possible. For these reasons, we tend to skip the OS packaged versions, since it has a tendency to try to put things in the OS standard hierarchy, which can be 'messy', for want of a better way to word it.
+<h4><a id="zkops">Operationalizing ZooKeeper</a></h4>
+Operationally, we do the following for a healthy ZooKeeper installation:
+<ul>
+  <li>Redundancy in the physical/hardware/network layout: try not to put them all in the same rack, decent (but don't go nuts) hardware, try to keep redundant power and network paths, etc.</li>
+  <li>I/O segregation: if you do a lot of write type traffic you'll almost definitely want the transaction logs on a different disk group than application logs and snapshots (the write to the ZooKeeper service has a synchronous write to disk, which can be slow).</li>
+  <li>Application segregation: Unless you really understand the application patterns of other apps that you want to install on the same box, it can be a good idea to run ZooKeeper in isolation (though this can be a balancing act with the capabilities of the hardware).</li>
+  <li>Use care with virtualization: It can work, depending on your cluster layout and read/write patterns and SLAs, but the tiny overheads introduced by the virtualization layer can add up and throw off ZooKeeper, as it can be very time-sensitive.</li>
+  <li>ZooKeeper configuration and monitoring: It's Java, so make sure you give it 'enough' heap space (we usually run them with 3-5G, but that's mostly due to the data set size we have here). Unfortunately we don't have a good formula for it. As far as monitoring, both JMX and the four-letter commands are very useful; they do overlap in some cases (and in those cases we prefer the four-letter commands, they seem more predictable, or at the very least, they work better with the LI monitoring infrastructure).</li>
+  <li>Don't overbuild the cluster: large clusters, especially in a write heavy usage pattern, mean a lot of intracluster communication (quorums on the writes and subsequent cluster member updates), but don't underbuild it (and risk swamping the cluster).</li>
+  <li>Try to run on a 3-5 node cluster: ZooKeeper writes use quorums and inherently that means having an odd number of machines in a cluster. Remember that a 5 node cluster will cause writes to slow down compared to a 3 node cluster, but will allow more fault tolerance.</li>
+</ul>
+Overall, we try to keep the ZooKeeper system as small as will handle the load (plus standard growth capacity planning) and as simple as possible. We try not to do anything fancy with the configuration or application layout as compared to the official release, as well as keep it as self contained as possible. For these reasons, we tend to skip the OS packaged versions, since they have a tendency to put things in the OS standard hierarchy, which can be 'messy', for want of a better way to word it.

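The socket buffer tuning mentioned in the datacenters section maps onto the broker settings below; the values are purely illustrative, and the right numbers come from the bandwidth-delay product of the link (clients have corresponding buffer settings of their own).

<pre>
# broker: server.properties (example values only; size to the link's bandwidth-delay product)
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
</pre>
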
Modified: kafka/site/081/quickstart.html
URL: http://svn.apache.org/viewvc/kafka/site/081/quickstart.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/quickstart.html (original)
+++ kafka/site/081/quickstart.html Sun Mar  9 18:32:49 2014
@@ -5,7 +5,7 @@
 <a href="../downloads.html" title="Kafka downloads">Download</a> the 0.8.1 release.
 
 <pre>
-&gt; <b>tar xzf kafka-&lt;VERSION&gt;.tgz</b>
+&gt; <b>tar -xzf kafka-&lt;VERSION&gt;.tgz</b>
 &gt; <b>cd kafka-&lt;VERSION&gt;</b>
 &gt; <b>./sbt update</b>
 &gt; <b>./sbt package</b>
@@ -55,7 +55,7 @@ Run the producer and then type a few mes
 
 <pre>
 &gt; <b>bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test</b> 
-<b>This is a message</a>
+<b>This is a message</b>
 <b>This is another message</b>
 </pre>
 

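For completeness, the console producer shown in the quickstart hunk above pairs with a console consumer invocation along these lines, assuming the same local ZooKeeper and topic:

<pre>
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This is a message
This is another message
</pre>
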
Modified: kafka/site/081/upgrade.html
URL: http://svn.apache.org/viewvc/kafka/site/081/upgrade.html?rev=1575736&r1=1575735&r2=1575736&view=diff
==============================================================================
--- kafka/site/081/upgrade.html (original)
+++ kafka/site/081/upgrade.html Sun Mar  9 18:32:49 2014
@@ -1,8 +1,8 @@
-<h3><a id="upgrade">Upgrading From Previous Versions</a></h3>
-<h4>Upgrading from 0.7</h4>
+<h3><a id="upgrade">1.5 Upgrading From Previous Versions</a></h3>
+<h4>Upgrading from 0.8.0 to 0.8.1</h4>
 
-0.8, the release in which added replication, was our first backwards-incompatible release. The upgrade from 0.7 to 0.8.x requires a <a href="https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8">special tool</a> for migration. This migration can be done without downtime.
+0.8.1 is fully compatible with 0.8. The upgrade can be done one broker at a time by simply bringing it down, updating the code, and restarting it.
 
-<h4>Upgrading from 0.8 to 0.8.1</h4>
+<h4>Upgrading from 0.7</h4>
 
-0.8.1 is fully compatible with 0.8. The upgrade can be done one broker at a time by simply bringing it down, updating the code, and restarting it.
+0.8, the release that added replication, was our first backwards-incompatible release: major changes were made to the API, ZooKeeper data structures, protocol, and configuration. The upgrade from 0.7 to 0.8.x requires a <a href="https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8">special tool</a> for migration. This migration can be done without downtime.
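
A rolling 0.8.0 to 0.8.1 upgrade as described above amounts to repeating roughly the following on each broker in turn; the scripts and ordering are a sketch, not part of this patch.

<pre>
> bin/kafka-server-stop.sh                              # stop one broker
# replace the Kafka binaries on this host with the 0.8.1 release
> bin/kafka-server-start.sh config/server.properties    # restart it on the new code
# wait for the broker to rejoin the in-sync replica set before moving on to the next one
</pre>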