Posted to commits@hudi.apache.org by bh...@apache.org on 2020/01/23 06:32:42 UTC

[incubator-hudi] branch asf-site updated: Adding delete docs to QuickStart

This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 5964367  Adding delete docs to QuickStart
5964367 is described below

commit 5964367084a2cc7d4e540b8cd235e08b3774e5af
Author: Sivabalan Narayanan <si...@uber.com>
AuthorDate: Sat Jan 18 13:37:27 2020 -0500

    Adding delete docs to QuickStart
---
 docs/_docs/0.5.0/1_1_quick_start_guide.md |  2 +-
 docs/_docs/1_1_quick_start_guide.md       | 35 ++++++++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/docs/_docs/0.5.0/1_1_quick_start_guide.md b/docs/_docs/0.5.0/1_1_quick_start_guide.md
index 5358aa0..c938f8e 100644
--- a/docs/_docs/0.5.0/1_1_quick_start_guide.md
+++ b/docs/_docs/0.5.0/1_1_quick_start_guide.md
@@ -159,7 +159,7 @@ val incViewDF = spark.read.format("org.apache.hudi").
     load(basePath);
 incViewDF.registerTempTable("hudi_incr_table")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare > 20.0").show()
-``` 
+```
 
 ## Where to go from here?
 
diff --git a/docs/_docs/1_1_quick_start_guide.md b/docs/_docs/1_1_quick_start_guide.md
index e7c7f37..708bd2c 100644
--- a/docs/_docs/1_1_quick_start_guide.md
+++ b/docs/_docs/1_1_quick_start_guide.md
@@ -158,7 +158,40 @@ val tripsPointInTimeDF = spark.read.format("org.apache.hudi").
     load(basePath);
 tripsPointInTimeDF.registerTempTable("hudi_trips_point_in_time")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_trips_point_in_time where fare > 20.0").show()
-``` 
+```
+
+## Delete data {#deletes}
+Delete records for the HoodieKeys passed in.
+
+```
+// fetch total records count
+spark.sql("select uuid, partitionPath from hudi_ro_table").count()
+// fetch two records to be deleted
+val ds = spark.sql("select uuid, partitionPath from hudi_ro_table").limit(2)
+
+// issue deletes
+val deletes = dataGen.generateDeletes(ds.collectAsList())
+val df = spark.read.json(spark.sparkContext.parallelize(deletes, 2));
+df.write.format("org.apache.hudi").
+options(getQuickstartWriteConfigs).
+option(OPERATION_OPT_KEY,"delete").
+option(PRECOMBINE_FIELD_OPT_KEY, "ts").
+option(RECORDKEY_FIELD_OPT_KEY, "uuid").
+option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
+option(TABLE_NAME, tableName).
+mode(Append).
+save(basePath);
+
+// run the same read query as above.
+val roAfterDeleteViewDF = spark.
+    read.
+    format("org.apache.hudi").
+    load(basePath + "/*/*/*/*")
+roAfterDeleteViewDF.registerTempTable("hudi_ro_table")
+// fetch should return (total - 2) records
+spark.sql("select uuid, partitionPath from hudi_ro_table").count()
+```
+Note: Only `Append` mode is supported for delete operation.
 
 ## Where to go from here?
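
The delete flow added in the diff above boils down to: count the records, collect the HoodieKeys (record key + partition path) of the rows to remove, issue a write with the `delete` operation, and re-read to confirm the count dropped by two. The sketch below illustrates that key-matching semantics in plain Python; it is a hypothetical stand-in, not the Hudi or Spark API, and the toy `table` data is invented for illustration.

```python
# Hypothetical plain-Python sketch of the delete semantics from the
# quickstart diff above: records whose key (uuid, partitionpath) matches
# a key passed to the delete are removed. NOT the Hudi API.

# A toy "table" of records keyed by uuid and partitionpath.
table = [
    {"uuid": f"id-{i}", "partitionpath": f"part-{i % 2}", "fare": 10.0 * i}
    for i in range(5)
]

# fetch total records count
total = len(table)

# fetch two records to be deleted (only their keys, as the docs do)
delete_keys = {(r["uuid"], r["partitionpath"]) for r in table[:2]}

# "issue deletes": drop every record whose key is in the delete set
table = [r for r in table if (r["uuid"], r["partitionpath"]) not in delete_keys]

# fetch should return (total - 2) records
print(len(table))  # 3
```

In the real quickstart, the same effect comes from writing the keys back with `option(OPERATION_OPT_KEY, "delete")` in `Append` mode, which is why the note says only `Append` mode is supported for deletes.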