Posted to commits@spark.apache.org by ka...@apache.org on 2020/11/05 07:45:31 UTC

[spark] branch branch-2.4 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples

This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 8684720  [MINOR][SS][DOCS] Update join type in stream static joins code examples
8684720 is described below

commit 868472040a9eb67169086ed0fab0afcbbbd321f9
Author: Sarvesh Dave <sa...@gmail.com>
AuthorDate: Thu Nov 5 16:22:31 2020 +0900

    [MINOR][SS][DOCS] Update join type in stream static joins code examples
    
    ### What changes were proposed in this pull request?
    Update join type in stream static joins code examples in structured streaming programming guide.
    1) The Scala, Java and Python examples share a common issue.
        The join type is "right_join"; it should be "left_outer".
    
        _Reasons:_
        a) The code snippet is an example of a left outer join, because the streaming df is on the left and the static df is on the right. Also, a right outer join between a streaming df (left) and a static df (right) is not supported.
        b) The keywords "right_join" and "left_join" are unsupported; the supported keywords are "right_outer" and "left_outer".
    
    So, all of these code snippets have been updated to "left_outer".
    
    2) The R example is correct, but it shows a "right_outer" join with the static df on the left and the streaming df on the right.
    It is changed to "left_outer" to make it consistent with the other three examples (Scala, Java and Python).
    
    ### Why are the changes needed?
    To fix a mistake in the example code of the documentation.
    
    ### Does this PR introduce _any_ user-facing change?
    Yes, it is a user-facing change (but documentation update only).
    
    **Screenshots 1: Scala/Java/Python example (similar issue)**
    _Before:_
    <img width="941" alt="Screenshot 2020-11-05 at 12 16 09 AM" src="https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png">
    
    _After:_
    <img width="922" alt="Screenshot 2020-11-05 at 12 17 12 AM" src="https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png">
    
    **Screenshots 2: R example (Make it consistent with above change)**
    _Before:_
    <img width="896" alt="Screenshot 2020-11-05 at 12 19 57 AM" src="https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png">
    
    _After:_
    <img width="919" alt="Screenshot 2020-11-05 at 12 20 51 AM" src="https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png">
    
    ### How was this patch tested?
    The change was tested locally.
    1) cd docs/
        SKIP_API=1 jekyll build
    2) Verify docs/_site/structured-streaming-programming-guide.html file in browser.
    
    Closes #30252 from sarveshdave1/doc-update-stream-static-joins.
    
    Authored-by: Sarvesh Dave <sa...@gmail.com>
    Signed-off-by: Jungtaek Lim (HeartSaVioR) <ka...@gmail.com>
    (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd)
    Signed-off-by: Jungtaek Lim (HeartSaVioR) <ka...@gmail.com>
---
 docs/structured-streaming-programming-guide.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index dce4b35..aac262b 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -1089,7 +1089,7 @@ val staticDf = spark.read. ...
 val streamingDf = spark.readStream. ...
 
 streamingDf.join(staticDf, "type")          // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join")  // right outer join with a static DF  
+streamingDf.join(staticDf, "type", "left_outer")  // left outer join with a static DF
 
 {% endhighlight %}
 
@@ -1100,7 +1100,7 @@ streamingDf.join(staticDf, "type", "right_join")  // right outer join with a sta
 Dataset<Row> staticDf = spark.read(). ...;
 Dataset<Row> streamingDf = spark.readStream(). ...;
 streamingDf.join(staticDf, "type");         // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join");  // right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer");  // left outer join with a static DF
 {% endhighlight %}
 
 
@@ -1111,7 +1111,7 @@ streamingDf.join(staticDf, "type", "right_join");  // right outer join with a st
 staticDf = spark.read. ...
 streamingDf = spark.readStream. ...
 streamingDf.join(staticDf, "type")  # inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join")  # right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer")  # left outer join with a static DF
 {% endhighlight %}
 
 </div>
@@ -1123,10 +1123,10 @@ staticDf <- read.df(...)
 streamingDf <- read.stream(...)
 joined <- merge(streamingDf, staticDf, sort = FALSE)  # inner equi-join with a static DF
 joined <- join(
+            streamingDf,
             staticDf,
-            streamingDf, 
             streamingDf$value == staticDf$value,
-            "right_outer")  # right outer join with a static DF
+            "left_outer")  # left outer join with a static DF
 {% endhighlight %}
 
 </div>
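
The two reasons given in the commit message can be sketched in plain Python, without a Spark dependency. This is only an illustrative model, not Spark code: the helper names `normalize_join_type` and `is_supported_stream_static` are hypothetical, the table of accepted names is an assumption based on how Spark 2.4 appears to parse join-type strings (lowercase, underscores removed), and the support rules cover only the inner and outer join cases from the stream-static rows of the programming guide's join support matrix.

```python
# Assumed set of join-type names accepted by Spark 2.4 after lowercasing and
# removing underscores. Verify against the Spark version you actually use.
_CANONICAL = {
    "inner": "inner",
    "outer": "full_outer",
    "full": "full_outer",
    "fullouter": "full_outer",
    "leftouter": "left_outer",
    "left": "left_outer",
    "rightouter": "right_outer",
    "right": "right_outer",
    "leftsemi": "left_semi",
    "leftanti": "left_anti",
    "cross": "cross",
}


def normalize_join_type(name: str) -> str:
    """Map a user-supplied join-type string to a canonical name (hypothetical helper)."""
    key = name.lower().replace("_", "")
    if key not in _CANONICAL:
        raise ValueError(f"Unsupported join type: {name!r}")
    return _CANONICAL[key]


def is_supported_stream_static(join_type: str, streaming_side: str) -> bool:
    """Return True if a stream-static join of this type is supported.

    streaming_side is "left" or "right": the side the streaming DataFrame is on.
    Only inner and outer joins are modeled; other types raise ValueError.
    """
    if streaming_side not in ("left", "right"):
        raise ValueError("streaming_side must be 'left' or 'right'")
    canonical = normalize_join_type(join_type)
    if canonical == "inner":
        return True
    if canonical == "left_outer":
        # The outer (preserved) side must be the streaming side.
        return streaming_side == "left"
    if canonical == "right_outer":
        return streaming_side == "right"
    if canonical == "full_outer":
        return False
    raise ValueError(f"Join type {canonical!r} is not modeled in this sketch")


if __name__ == "__main__":
    # Reason (b): "right_join" is not a recognized join-type string.
    try:
        normalize_join_type("right_join")
    except ValueError as err:
        print(err)
    # Reason (a): with the streaming df on the left, only the left outer join works.
    print(is_supported_stream_static("left_outer", "left"))
    print(is_supported_stream_static("right_outer", "left"))
```

Under these assumptions, the old examples fail on both counts: the string "right_join" is rejected outright, and even the correctly spelled "right_outer" would not be supported with the streaming df on the left.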

