Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2018/05/25 09:47:52 UTC

[GitHub] ivankelly commented on a change in pull request #1466: Topic compaction documentation

URL: https://github.com/apache/incubator-pulsar/pull/1466#discussion_r190840491
 
 

 ##########
 File path: site/docs/latest/cookbooks/compaction.md
 ##########
 @@ -0,0 +1,113 @@
+---
+title: Topic compaction cookbook
+tags: [admin, clients, compaction]
+---
+
+Pulsar's [topic compaction](../../getting-started/ConceptsAndArchitecture#compaction) feature enables you to create **compacted** topics in which older, "obscured" entries are pruned from the topic, allowing for faster reads through the topic's history. Which messages are deemed obscured, outdated, or irrelevant depends on your use case.
+
+To use compaction:
+
+* You must manually [trigger](#trigger) compaction using the Pulsar administrative API. This will both run a compaction operation *and* mark the topic as a compacted topic.
+* Your {% popover consumers %} must be [configured](#config) to read from compacted topics (or else the messages won't be properly read/processed/acknowledged).
+
+In Pulsar, topic compaction takes place on a *per-key basis*, meaning that messages are compacted based on their key. In a stock ticker use case, for example, the stock symbol (e.g. `AAPL` or `GOOG`) could serve as the key.
+
+## When should I use compacted topics?
+
+The classic example of a topic that could benefit from compaction would be a stock ticker topic through which {% popover consumers %} can access up-to-date values for specific stocks. On a stock ticker topic you only care about the most recent value of each stock; "historical values" don't matter, so there's no need to read through outdated data when processing a topic's messages. For topics where older values are important, for example when you need to process long series of messages in order, compaction is unnecessary and possibly even harmful.
+
+{% include admonition.html type="warning" content="Compaction only works on topics where each message has a key (as in the stock ticker example, where the stock symbol serves as the key). Keys can be thought of as the axis along which compaction is applied." %}
+
+## When should I trigger compaction?
+
+How often you trigger compaction will vary widely based on the use case. If fast reads from a compacted topic are a priority, you should run compaction fairly frequently.
+
+{% include admonition.html type="warning" title="No automatic compaction" content="Currently, all topic compaction in Pulsar must be initiated manually." %}
+
+## Which messages get compacted?
+
+When you [trigger](#trigger) compaction on a topic, all messages with the following
 
 Review comment:
   missing something at the end here.
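
As an illustration of the per-key compaction that the quoted cookbook text describes, here is a minimal sketch in plain Python. It does not use the Pulsar client: the `Message` tuple and the `compact` function are hypothetical names, the prices are made-up values, and the sketch assumes that compaction keeps only the most recent message for each key, as in the stock ticker example.

```python
from collections import namedtuple

# Hypothetical message type for illustration; not the Pulsar client API.
Message = namedtuple("Message", ["key", "value"])


def compact(messages):
    """Keep only the most recent message for each key (assumed semantics)."""
    latest = {}
    for position, msg in enumerate(messages):
        # A later message with the same key obscures the earlier one.
        latest[msg.key] = (position, msg)
    # Return the surviving messages in their original relative order.
    return [msg for _, msg in sorted(latest.values(), key=lambda pair: pair[0])]


# Made-up ticker values, keyed by stock symbol.
ticker = [
    Message("AAPL", 187.0),
    Message("GOOG", 1075.0),
    Message("AAPL", 188.5),
    Message("GOOG", 1079.2),
    Message("AAPL", 188.1),
]
compacted = compact(ticker)
```

In this sketch, only the last `GOOG` and the last `AAPL` messages survive, which is why a reader of the compacted view never has to scan the older values.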
