Posted to commits@spark.apache.org by ka...@apache.org on 2020/08/22 12:39:22 UTC
[spark] branch branch-3.0 updated: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
This is an automated email from the ASF dual-hosted git repository.
kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
a6df16b is described below
commit a6df16b36210da32359c77205920eaee98d3e232
Author: Yuanjian Li <yu...@databricks.com>
AuthorDate: Sat Aug 22 21:32:23 2020 +0900
[SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
### What changes were proposed in this pull request?
Rephrase the description for some operations to make it clearer.
### Why are the changes needed?
Add more detail to the document.
### Does this PR introduce _any_ user-facing change?
No, document only.
### How was this patch tested?
Document only.
Closes #29269 from xuanyuanking/SPARK-31792-follow.
Authored-by: Yuanjian Li <yu...@databricks.com>
Signed-off-by: Jungtaek Lim (HeartSaVioR) <ka...@gmail.com>
(cherry picked from commit 8b26c69ce7f9077775a3c7bbabb1c47ee6a51a23)
Signed-off-by: Jungtaek Lim (HeartSaVioR) <ka...@gmail.com>
---
docs/web-ui.md | 10 +++++-----
.../spark/sql/execution/streaming/MicroBatchExecution.scala | 3 +--
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/docs/web-ui.md b/docs/web-ui.md
index 134a8c8..fe26043 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -425,11 +425,11 @@ queries. Currently, it contains the following metrics.
* **Batch Duration.** The process duration of each batch.
* **Operation Duration.** The amount of time taken to perform various operations in milliseconds.
The tracked operations are listed as follows.
- * addBatch: Adds result data of the current batch to the sink.
- * getBatch: Gets a new batch of data to process.
- * latestOffset: Gets the latest offsets for sources.
- * queryPlanning: Generates the execution plan.
- * walCommit: Writes the offsets to the metadata log.
+ * addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time.
+ * getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources.
+ * latestOffset & getOffset: Time taken to query the maximum available offset for this source.
+ * queryPlanning: Time taken to generate the execution plan.
+ * walCommit: Time taken to write the offsets to the metadata log.
As an early-release version, the statistics page is still under development and will be improved in
future releases.
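The per-operation durations described above are also exposed programmatically: each StreamingQueryProgress event carries a durationMs map keyed by these operation names. A minimal sketch in plain Python, using an illustrative hand-written payload rather than a live query (the helper function names are not part of any Spark API):

```python
# Illustrative durationMs payload, shaped like the map found in a
# StreamingQueryProgress event (values are milliseconds).
sample_duration_ms = {
    "addBatch": 1840,
    "getBatch": 12,
    "latestOffset": 35,
    "queryPlanning": 90,
    "walCommit": 55,
}

def dominant_operation(duration_ms):
    """Return the operation that consumed the most time in this micro-batch."""
    return max(duration_ms, key=duration_ms.get)

def addbatch_fraction(duration_ms):
    """Fraction of the micro-batch spent in addBatch; per the doc text above,
    this should usually account for the bulk of the batch's time."""
    total = sum(duration_ms.values())
    return duration_ms["addBatch"] / total if total else 0.0
```

Against a running query one would read query.lastProgress["durationMs"] (PySpark) in place of the hand-written dict; "durationMs" is the field name used in the public progress JSON.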
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
index e022bfb..e0731db 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
@@ -566,8 +566,7 @@ class MicroBatchExecution(
val nextBatch =
new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema))
- val batchSinkProgress: Option[StreamWriterCommitProgress] =
- reportTimeTaken("addBatch") {
+ val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") {
SQLExecution.withNewExecutionId(lastExecution) {
sink match {
case s: Sink => s.addBatch(currentBatchId, nextBatch)
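The reportTimeTaken wrapper in the diff above times a named execution phase and records the elapsed milliseconds under that key. A minimal Python sketch of the same pattern, assuming a simple dict-backed recorder (the names here are illustrative, not Spark's internals):

```python
import time

# Illustrative stand-in for the per-trigger metric map that Spark's
# ProgressReporter maintains internally.
current_durations_ms = {}

def report_time_taken(trigger_detail_key, body):
    """Run body(), record its wall-clock duration in milliseconds under
    trigger_detail_key, and return body's result -- mirroring how
    MicroBatchExecution wraps phases such as "addBatch"."""
    start = time.monotonic()
    result = body()
    elapsed_ms = (time.monotonic() - start) * 1000.0
    # Accumulate in case the same phase runs more than once per trigger.
    current_durations_ms[trigger_detail_key] = (
        current_durations_ms.get(trigger_detail_key, 0.0) + elapsed_ms
    )
    return result

# Example: time a trivial "addBatch"-like phase.
value = report_time_taken("addBatch", lambda: sum(range(1000)))
```

The design point the diff preserves is that the whole sink write (including query execution triggered inside SQLExecution.withNewExecutionId) stays inside the "addBatch" timing scope, which is why the doc text above says addBatch should take the bulk of the micro-batch's time.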