Posted to commits@hudi.apache.org by bh...@apache.org on 2020/01/23 06:32:42 UTC
[incubator-hudi] branch asf-site updated: Adding delete docs to QuickStart
This is an automated email from the ASF dual-hosted git repository.
bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 5964367 Adding delete docs to QuickStart
5964367 is described below
commit 5964367084a2cc7d4e540b8cd235e08b3774e5af
Author: Sivabalan Narayanan <si...@uber.com>
AuthorDate: Sat Jan 18 13:37:27 2020 -0500
Adding delete docs to QuickStart
---
docs/_docs/0.5.0/1_1_quick_start_guide.md |  2 +-
docs/_docs/1_1_quick_start_guide.md       | 35 ++++++++++++++++++++++++++++++-
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/docs/_docs/0.5.0/1_1_quick_start_guide.md b/docs/_docs/0.5.0/1_1_quick_start_guide.md
index 5358aa0..c938f8e 100644
--- a/docs/_docs/0.5.0/1_1_quick_start_guide.md
+++ b/docs/_docs/0.5.0/1_1_quick_start_guide.md
@@ -159,7 +159,7 @@ val incViewDF = spark.read.format("org.apache.hudi").
load(basePath);
incViewDF.registerTempTable("hudi_incr_table")
spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from hudi_incr_table where fare > 20.0").show()
-```
+```
## Where to go from here?
diff --git a/docs/_docs/1_1_quick_start_guide.md b/docs/_docs/1_1_quick_start_guide.md
index e7c7f37..708bd2c 100644
--- a/docs/_docs/1_1_quick_start_guide.md
+++ b/docs/_docs/1_1_quick_start_guide.md
@@ -158,7 +158,40 @@ val tripsPointInTimeDF = spark.read.format("org.apache.hudi").
load(basePath);
tripsPointInTimeDF.registerTempTable("hudi_trips_point_in_time")
spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from hudi_trips_point_in_time where fare > 20.0").show()
-```
+```
+
+## Delete data {#deletes}
+Delete records for the HoodieKeys passed in.
+
+```
+// fetch total records count
+spark.sql("select uuid, partitionPath from hudi_ro_table").count()
+// fetch two records to be deleted
+val ds = spark.sql("select uuid, partitionPath from hudi_ro_table").limit(2)
+
+// issue deletes
+val deletes = dataGen.generateDeletes(ds.collectAsList())
+val df = spark.read.json(spark.sparkContext.parallelize(deletes, 2));
+df.write.format("org.apache.hudi").
+options(getQuickstartWriteConfigs).
+option(OPERATION_OPT_KEY,"delete").
+option(PRECOMBINE_FIELD_OPT_KEY, "ts").
+option(RECORDKEY_FIELD_OPT_KEY, "uuid").
+option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
+option(TABLE_NAME, tableName).
+mode(Append).
+save(basePath);
+
+// run the same read query as above.
+val roAfterDeleteViewDF = spark.
+ read.
+ format("org.apache.hudi").
+ load(basePath + "/*/*/*/*")
+roAfterDeleteViewDF.registerTempTable("hudi_ro_table")
+// fetch should return (total - 2) records
+spark.sql("select uuid, partitionPath from hudi_ro_table").count()
+```
+Note: Only `Append` mode is supported for delete operation.
## Where to go from here?