Posted to commits@druid.apache.org by fj...@apache.org on 2019/05/22 19:02:17 UTC

[incubator-druid] branch 0.15.0-incubating updated: Update tutorial to delete data (#7577) (#7732)

This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch 0.15.0-incubating
in repository https://gitbox.apache.org/repos/asf/incubator-druid.git


The following commit(s) were added to refs/heads/0.15.0-incubating by this push:
     new f695425  Update tutorial to delete data (#7577) (#7732)
f695425 is described below

commit f695425c5b4c79ed945dbe58e2d32d74e5b47503
Author: Jihoon Son <ji...@apache.org>
AuthorDate: Wed May 22 10:48:56 2019 -0700

    Update tutorial to delete data (#7577) (#7732)
    
    * Update tutorial to delete data
    
    * update tutorial, remove old ways to drop data
    
    * PR comments
---
 .../content/tutorials/img/tutorial-deletion-02.png | Bin 200459 -> 810422 bytes
 .../content/tutorials/img/tutorial-deletion-03.png | Bin 0 -> 805673 bytes
 docs/content/tutorials/tutorial-delete-data.md     |  58 +++++++++++----------
 .../tutorial/deletion-disable-segments.json        |   7 +++
 4 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/docs/content/tutorials/img/tutorial-deletion-02.png b/docs/content/tutorials/img/tutorial-deletion-02.png
index fdea20f..9b84f0c 100644
Binary files a/docs/content/tutorials/img/tutorial-deletion-02.png and b/docs/content/tutorials/img/tutorial-deletion-02.png differ
diff --git a/docs/content/tutorials/img/tutorial-deletion-03.png b/docs/content/tutorials/img/tutorial-deletion-03.png
new file mode 100644
index 0000000..e6fb1f3
Binary files /dev/null and b/docs/content/tutorials/img/tutorial-deletion-03.png differ
diff --git a/docs/content/tutorials/tutorial-delete-data.md b/docs/content/tutorials/tutorial-delete-data.md
index 41812f7..46fbbdc 100644
--- a/docs/content/tutorials/tutorial-delete-data.md
+++ b/docs/content/tutorials/tutorial-delete-data.md
@@ -29,8 +29,6 @@ This tutorial demonstrates how to delete existing data.
 For this tutorial, we'll assume you've already downloaded Apache Druid (incubating) as described in 
 the [single-machine quickstart](index.html) and have it running on your local machine. 
 
-Completing [Tutorial: Configuring retention](../tutorials/tutorial-retention.html) first is highly recommended, as we will be using retention rules in this tutorial.
-
 ## Load initial data
 
 In this tutorial, we will use the Wikipedia edits data, with an indexing spec that creates hourly segments. This spec is located at `quickstart/tutorial/deletion-index.json`, and it creates a datasource called `deletion-tutorial`.
@@ -47,30 +45,25 @@ When the load finishes, open [http://localhost:8888/unified-console.html#datasou
 
 Permanent deletion of a Druid segment has two steps:
 
-1. The segment must first be marked as "unused". This occurs when a segment is dropped by retention rules, and when a user manually disables a segment through the Coordinator API. This tutorial will cover both cases.
+1. The segment must first be marked as "unused". This occurs when a user manually disables a segment through the Coordinator API.
 2. After segments have been marked as "unused", a Kill Task will delete any "unused" segments from Druid's metadata store as well as deep storage.
 
-Let's drop some segments now, first with load rules, then manually.
-
-## Drop some data with load rules
-
-As with the previous retention tutorial, there are currently 24 segments in the `deletion-tutorial` datasource.
-
-click the blue pencil icon next to `Cluster default: loadForever` for the `deletion-tutorial` datasource.
+Let's drop some segments now, by using the coordinator API to drop data by interval and segmentIds.
 
-A rule configuration window will appear. 
+## Disable segments by interval
 
-Now click the `+ New rule` button twice. 
+Let's disable segments in a specified interval. This will mark all segments in the interval as "unused", but not remove them from deep storage.
+Let's disable segments in interval `2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z` i.e. between hour 18 and 20.
 
-In the upper rule box, select `Load` and `by interval`, and then enter `2015-09-12T12:00:00.000Z/2015-09-13T00:00:00.000Z` in field next to `by interval`. Replicants can remain at 2 in the `_default_tier`.
-
-In the lower rule box, select `Drop` and `forever`.
+```bash
+curl -X 'POST' -H 'Content-Type:application/json' -d '{ "interval" : "2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z" }' http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/markUnused
+```
 
-Now click `Next` and enter `tutorial` for both the user and changelog comment field.
+After that command completes, you should see that the segment for hour 18 and 19 have been disabled:
 
-This will cause the first 12 segments of `deletion-tutorial` to be dropped. However, these dropped segments are not removed from deep storage.
+![Segments 2](../tutorials/img/tutorial-deletion-02.png "Segments 2")
 
-You can see that all 24 segments are still present in deep storage by listing the contents of `apache-druid-#{DRUIDVERSION}/var/druid/segments/deletion-tutorial`:
+Note that the hour 18 and 19 segments are still present in deep storage:
 
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
@@ -100,9 +93,9 @@ $ ls -l1 var/druid/segments/deletion-tutorial/
 2015-09-12T23:00:00.000Z_2015-09-13T00:00:00.000Z
 ```
 
-## Manually disable a segment
+## Disable segments by segment IDs
 
-Let's manually disable a segment now. This will mark a segment as "unused", but not remove it from deep storage.
+Let's disable some segments by their segmentID. This will again mark the segments as "unused", but not remove them from deep storage. You can see the full segmentID for a segment from UI as explained below.
 
 In the [segments view](http://localhost:8888/unified-console.html#segments), click the arrow on the left side of one of the remaining segments to expand the segment entry:
 
@@ -110,17 +103,29 @@ In the [segments view](http://localhost:8888/unified-console.html#segments), cli
 
 The top of the info box shows the full segment ID, e.g. `deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-02-28T01:11:51.606Z` for the segment of hour 14.
 
-Let's disable the hour 14 segment by sending the following DELETE request to the Coordinator, where {SEGMENT-ID} is the full segment ID shown in the info box:
+Let's disable the hour 13 and 14 segments by sending a POST request to the Coordinator with this payload
+
+```json
+{
+  "segmentIds":
+  [
+    "deletion-tutorial_2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z_2019-05-01T17:38:46.961Z",
+    "deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-05-01T17:38:46.961Z"
+  ]
+}
+```
+
+This payload json has been provided at `quickstart/tutorial/deletion-disable-segments.json`. Submit the POST request to Coordinator like this:
 
 ```bash
-curl -XDELETE http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/segments/{SEGMENT-ID}
+curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/deletion-disable-segments.json http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/markUnused
 ```
 
-After that command completes, you should see that the segment for hour 14 has been disabled:
+After that command completes, you should see that the segments for hour 13 and 14 have been disabled:
 
-![Segments 2](../tutorials/img/tutorial-deletion-02.png "Segments 2")
+![Segments 3](../tutorials/img/tutorial-deletion-03.png "Segments 3")
 
-Note that the hour 14 segment is still in deep storage:
+Note that the hour 13 and 14 segments are still in deep storage:
 
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
@@ -165,12 +170,9 @@ After this task completes, you can see that the disabled segments have now been
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
 2015-09-12T12:00:00.000Z_2015-09-12T13:00:00.000Z
-2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z
 2015-09-12T15:00:00.000Z_2015-09-12T16:00:00.000Z
 2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z
 2015-09-12T17:00:00.000Z_2015-09-12T18:00:00.000Z
-2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z
-2015-09-12T19:00:00.000Z_2015-09-12T20:00:00.000Z
 2015-09-12T20:00:00.000Z_2015-09-12T21:00:00.000Z
 2015-09-12T21:00:00.000Z_2015-09-12T22:00:00.000Z
 2015-09-12T22:00:00.000Z_2015-09-12T23:00:00.000Z
diff --git a/examples/quickstart/tutorial/deletion-disable-segments.json b/examples/quickstart/tutorial/deletion-disable-segments.json
new file mode 100644
index 0000000..920e071
--- /dev/null
+++ b/examples/quickstart/tutorial/deletion-disable-segments.json
@@ -0,0 +1,7 @@
+{
+  "segmentIds":
+  [
+    "deletion-tutorial_2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z_2019-05-01T17:38:46.961Z",
+    "deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-05-01T17:38:46.961Z"
+  ]
+}
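
The two `markUnused` calls added to the tutorial above (one by interval, one by segment IDs) could also be scripted. The following is a minimal sketch in Python, not part of the commit. It assumes the Coordinator address `localhost:8081` used in the tutorial; the helper name `build_mark_unused_request` is hypothetical, and the request is only constructed, never sent.

```python
import json
import urllib.request

# Assumption: Coordinator address used throughout the tutorial.
COORDINATOR = "http://localhost:8081"


def build_mark_unused_request(datasource, *, interval=None, segment_ids=None):
    """Build (but do not send) a POST to the markUnused endpoint from the diff.

    Hypothetical helper: exactly one of `interval` or `segment_ids` must be given,
    mirroring the two payload shapes shown in the tutorial.
    """
    if (interval is None) == (segment_ids is None):
        raise ValueError("pass exactly one of interval or segment_ids")
    if interval is not None:
        payload = {"interval": interval}
    else:
        payload = {"segmentIds": list(segment_ids)}
    url = f"{COORDINATOR}/druid/coordinator/v1/datasources/{datasource}/markUnused"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same interval as the tutorial: hours 18 and 19.
by_interval = build_mark_unused_request(
    "deletion-tutorial",
    interval="2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z",
)

# Same segment IDs as quickstart/tutorial/deletion-disable-segments.json.
by_ids = build_mark_unused_request(
    "deletion-tutorial",
    segment_ids=[
        "deletion-tutorial_2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z_2019-05-01T17:38:46.961Z",
        "deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-05-01T17:38:46.961Z",
    ],
)
```

To send either request against a running cluster, pass it to `urllib.request.urlopen`. The segment IDs carry a version timestamp, so IDs copied from the tutorial file will only match a cluster whose segments were created with that same version.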

