Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2020/02/15 18:44:17 UTC

[DOCS] Spark SQL Upgrading Guide

Hi,

I just noticed that
http://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html (Spark
2.4.5) has formatting issues in "Upgrading from Spark SQL 2.4.3 to 2.4.4"
[1], which have been fixed in master [2]. That's OK.

What made me wonder was the other change to the section "Upgrading from
Spark SQL 2.4 to 2.4.5" [3] that had the following item included:

"Starting from 2.4.5, SQL configurations are effective also when a Dataset
is converted to an RDD and its plan is executed due to action on the
derived RDD. The previous behavior can be restored setting
spark.sql.legacy.rdd.applyConf to false: in this case, SQL configurations
are ignored for operations performed on a RDD derived from a Dataset."

Why was this removed in master [4]? It was mentioned in "Notable changes"
of Spark Release 2.4.5 [5].

[1]
http://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-243-to-244
[2]
https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md#upgrading-from-spark-sql-243-to-244
[3]
http://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-24-to-245
[4]
https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md#upgrading-from-spark-sql-244-to-245
[5] http://spark.apache.org/releases/spark-release-2-4-5.html

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski


Re: [DOCS] Spark SQL Upgrading Guide

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks for checking it, Jacek.

On Sun, Feb 16, 2020 at 7:23 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> Never mind. Found this [1]:
>
> > This config is deprecated and it will be removed in 3.0.0.
>
> And so it has :) Thanks and sorry for the trouble.
>
> [1]
> https://github.com/apache/spark/blob/830a4ec59b86253f18eb7dfd6ed0bbe0d7920e5b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1306-L1307

Re: [DOCS] Spark SQL Upgrading Guide

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

Never mind. Found this [1]:

> This config is deprecated and it will be removed in 3.0.0.

And so it has :) Thanks and sorry for the trouble.

[1]
https://github.com/apache/spark/blob/830a4ec59b86253f18eb7dfd6ed0bbe0d7920e5b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1306-L1307

Regards,
Jacek Laskowski