Posted to commits@spark.apache.org by vi...@apache.org on 2021/10/28 16:23:49 UTC

[spark] branch master updated: [MINOR][SS][DOCS] Point to correct examples of Arbitrary Stateful Operations

This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 5b2bbce  [MINOR][SS][DOCS] Point to correct examples of Arbitrary Stateful Operations
5b2bbce is described below

commit 5b2bbcef6854c495c32b37e383dd5f1f6ce23dd4
Author: Liang-Chi Hsieh <vi...@gmail.com>
AuthorDate: Thu Oct 28 09:22:42 2021 -0700

    [MINOR][SS][DOCS] Point to correct examples of Arbitrary Stateful Operations
    
    ### What changes were proposed in this pull request?
    
    This fixes incorrect example links in the Structured Streaming Programming Guide.
    
    ### Why are the changes needed?
    
    StructuredSessionization.scala and JavaStructuredSessionization.java now use a session window expression, not `flatMapGroupsWithState`. The section discusses arbitrary stateful operations and should point to other examples.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    Doc change only.
    
    Closes #34408 from viirya/fix-ss-doc.
    
    Authored-by: Liang-Chi Hsieh <vi...@gmail.com>
    Signed-off-by: Liang-Chi Hsieh <vi...@gmail.com>
---
 docs/structured-streaming-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 6e98d5a..b36cdc7 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -1806,7 +1806,7 @@ However, as a side effect, data from the slower streams will be aggressively dro
 this configuration judiciously.
 
 ### Arbitrary Stateful Operations
-Many usecases require more advanced stateful operations than aggregations. For example, in many usecases, you have to track sessions from data streams of events. For doing such sessionization, you will have to save arbitrary types of data as state, and perform arbitrary operations on the state using the data stream events in every trigger. Since Spark 2.2, this can be done using the operation `mapGroupsWithState` and the more powerful operation `flatMapGroupsWithState`. Both operations a [...]
+Many usecases require more advanced stateful operations than aggregations. For example, in many usecases, you have to track sessions from data streams of events. For doing such sessionization, you will have to save arbitrary types of data as state, and perform arbitrary operations on the state using the data stream events in every trigger. Since Spark 2.2, this can be done using the operation `mapGroupsWithState` and the more powerful operation `flatMapGroupsWithState`. Both operations a [...]
 
 Though Spark cannot check and force it, the state function should be implemented with respect to the semantics of the output mode. For example, in Update mode Spark doesn't expect that the state function will emit rows which are older than current watermark plus allowed late record delay, whereas in Append mode the state function can emit these rows.
 
