You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by wangyum <gi...@git.apache.org> on 2017/12/01 09:07:46 UTC
[GitHub] spark pull request #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast beh...
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/19858
[SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior changes in migration section
## What changes were proposed in this pull request?
Update broadcast behavior changes in migration section.
## How was this patch tested?
N/A
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark SPARK-22489-migration
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19858.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19858
----
commit 4fedff1695d8ec6d524420e07779e2c808d652aa
Author: Yuming Wang <wg...@gmail.com>
Date: 2017-12-01T09:01:23Z
migration
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84383/testReport)** for PR 19858 at commit [`76148f5`](https://github.com/apache/spark/commit/76148f582404a9f5272ae55d9994133020832536).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84374 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84374/testReport)** for PR 19858 at commit [`4fedff1`](https://github.com/apache/spark/commit/4fedff1695d8ec6d524420e07779e2c808d652aa).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19858
Thanks! Merged to master.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84395/testReport)** for PR 19858 at commit [`069f8b6`](https://github.com/apache/spark/commit/069f8b6036fb3b23a00e9bb539d7128aedeaa084).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84383/testReport)** for PR 19858 at commit [`76148f5`](https://github.com/apache/spark/commit/76148f582404a9f5272ae55d9994133020832536).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84395/testReport)** for PR 19858 at commit [`069f8b6`](https://github.com/apache/spark/commit/069f8b6036fb3b23a00e9bb539d7128aedeaa084).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84395/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19858
LGTM
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19858
**[Test build #84374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84374/testReport)** for PR 19858 at commit [`4fedff1`](https://github.com/apache/spark/commit/4fedff1695d8ec6d524420e07779e2c808d652aa).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast beh...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/19858
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by wangyum <gi...@git.apache.org>.
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/19858
cc @gatorsmile
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84374/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast beh...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19858#discussion_r154490770
--- Diff: docs/sql-programming-guide.md ---
@@ -1776,6 +1776,8 @@ options.
Note that, for <b>DecimalType(38,0)*</b>, the table above intentionally does not cover all other combinations of scales and precisions because currently we only infer decimal type like `BigInteger`/`BigInt`. For example, 1.1 is inferred as double type.
- In PySpark, now we need Pandas 0.19.2 or upper if you want to use Pandas related functionalities, such as `toPandas`, `createDataFrame` from Pandas DataFrame, etc.
- In PySpark, the behavior of timestamp values for Pandas related functionalities was changed to respect session timezone. If you want to use the old behavior, you need to set a configuration `spark.sql.execution.pandas.respectSessionTimeZone` to `False`. See [SPARK-22395](https://issues.apache.org/jira/browse/SPARK-22395) for details.
+
+ - Since Spark 2.3, when either broadcast hash join or broadcast nested loop join is applicable, we prefer to broadcasting the table that is explicitly specified in a broadcast hint. For details, see the section [Broadcast Hint](#broadcast-hint-for-sql-queries) and [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.
--- End diff --
Sorry, there is a duplicate `for details`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast beh...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19858#discussion_r154453799
--- Diff: docs/sql-programming-guide.md ---
@@ -1776,6 +1776,8 @@ options.
Note that, for <b>DecimalType(38,0)*</b>, the table above intentionally does not cover all other combinations of scales and precisions because currently we only infer decimal type like `BigInteger`/`BigInt`. For example, 1.1 is inferred as double type.
- In PySpark, now we need Pandas 0.19.2 or upper if you want to use Pandas related functionalities, such as `toPandas`, `createDataFrame` from Pandas DataFrame, etc.
- In PySpark, the behavior of timestamp values for Pandas related functionalities was changed to respect session timezone. If you want to use the old behavior, you need to set a configuration `spark.sql.execution.pandas.respectSessionTimeZone` to `False`. See [SPARK-22395](https://issues.apache.org/jira/browse/SPARK-22395) for details.
+
+ - Since Spark 2.3, broadcast behaviour changed to broadcast the join side with an explicit broadcast hint first. See [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.
--- End diff --
```
Since Spark 2.3, when either broadcast hash join or broadcast nested loop join is applicable, we prefer to broadcasting the table that is explicitly specified in a broadcast hint. For details, see the section [JDBC/ODBC](#broadcast-hint-for-sql-queries) and [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19858: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior c...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19858
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84383/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org