Posted to user@spark.apache.org by Koert Kuipers <ko...@tresata.com> on 2017/04/04 20:10:55 UTC
how do i force unit test to do whole stage codegen
i wrote my own expression with eval and doGenCode, but doGenCode never gets
called in tests.
also as a test i ran this in a unit test:
spark.range(10).select('id as 'asId).where('id === 4).explain
according to
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-whole-stage-codegen.html
this is supposed to show:
== Physical Plan ==
WholeStageCodegen
:  +- Project [id#0L AS asId#3L]
:     +- Filter (id#0L = 4)
:        +- Range 0, 1, 8, 10, [id#0L]
but it doesn't. instead it shows:
== Physical Plan ==
*Project [id#12L AS asId#15L]
+- *Filter (id#12L = 4)
   +- *Range (0, 10, step=1, splits=Some(4))
so i am again missing the WholeStageCodegen. any idea why?
i create spark session for unit tests simply as:
val session = SparkSession.builder
  .master("local[*]")
  .appName("test")
  .config("spark.sql.shuffle.partitions", 4)
  .getOrCreate()
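Rather than reading the explain output by eye, a test can inspect the executed physical plan directly. The following is only a minimal sketch, assuming a Spark 2.x-or-later spark-sql dependency on the test classpath: `CodegenCheck` and `usesWholeStageCodegen` are hypothetical names introduced for illustration, while `WholeStageCodegenExec`, `queryExecution.executedPlan` and `debugCodegen()` come from Spark itself. Whole-stage codegen is governed by `spark.sql.codegen.wholeStage`, which defaults to true, so the session below is left unchanged from the one in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.WholeStageCodegenExec
import org.apache.spark.sql.execution.debug._

object CodegenCheck {
  // Hypothetical helper (not part of Spark): true when the executed physical
  // plan contains at least one WholeStageCodegenExec node.
  def usesWholeStageCodegen(df: DataFrame): Boolean =
    df.queryExecution.executedPlan
      .collect { case w: WholeStageCodegenExec => w }
      .nonEmpty

  def main(args: Array[String]): Unit = {
    // Same session setup as in the question.
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("test")
      .config("spark.sql.shuffle.partitions", 4)
      .getOrCreate()
    import spark.implicits._

    // Same query as in the question, written with $"..." instead of symbol literals.
    val df = spark.range(10).select($"id" as "asId").where($"id" === 4)

    // Fails the test if no operator was compiled with whole-stage codegen.
    assert(usesWholeStageCodegen(df), "expected a WholeStageCodegenExec node")

    // Prints the generated Java source for each codegen subtree, which helps
    // when checking whether a custom expression's doGenCode output is present.
    df.debugCodegen()

    spark.stop()
  }
}
```

Asserting on the plan node rather than on the explain text keeps the test independent of how a given Spark version formats its plan output.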
Re: how do i force unit test to do whole stage codegen
Posted by Koert Kuipers <ko...@tresata.com>.
got it. that's good to know. thanks!
On Wed, Apr 5, 2017 at 12:07 AM, Kazuaki Ishizaki <IS...@jp.ibm.com>
wrote:
> Hi,
> The page at that URL describes the old style of physical plan output.
> The current style adds "*" as a prefix to each operator that
> whole-stage codegen can be applied to.
>
> So, in your test case, whole-stage codegen is already enabled!!
>
> FYI. I think that it is a good topic for dev@spark.apache.org.
>
> Kazuaki Ishizaki
Re: how do i force unit test to do whole stage codegen
Posted by Jacek Laskowski <ja...@japila.pl>.
Thanks, Koert, for the kind words. That part, however, is easy to fix, and
I was surprised to see the old style referenced (!)
Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Wed, Apr 5, 2017 at 6:14 PM, Koert Kuipers <ko...@tresata.com> wrote:
> it's pretty much impossible to be fully up to date with spark given how fast
> it moves!
>
> the book is a very helpful reference
Re: how do i force unit test to do whole stage codegen
Posted by Koert Kuipers <ko...@tresata.com>.
it's pretty much impossible to be fully up to date with spark given how fast
it moves!
the book is a very helpful reference
On Wed, Apr 5, 2017 at 11:15 AM, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi,
>
> I'm very sorry for not being up to date with the current style (and
> "promoting" the old style) and am going to review that part soon. I'm very
> close to touching it again since I'm working with the Optimizer these days.
>
> Jacek
Re: how do i force unit test to do whole stage codegen
Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,
I'm very sorry for not being up to date with the current style (and
"promoting" the old style) and am going to review that part soon. I'm very
close to touching it again since I'm working with the Optimizer these days.
Jacek
On 5 Apr 2017 6:08 a.m., "Kazuaki Ishizaki" <IS...@jp.ibm.com> wrote:
> Hi,
> The page at that URL describes the old style of physical plan output.
> The current style adds "*" as a prefix to each operator that
> whole-stage codegen can be applied to.
>
> So, in your test case, whole-stage codegen is already enabled!!
>
> FYI. I think that it is a good topic for dev@spark.apache.org.
>
> Kazuaki Ishizaki
Re: how do i force unit test to do whole stage codegen
Posted by Kazuaki Ishizaki <IS...@jp.ibm.com>.
Hi,
The page at that URL describes the old style of physical plan output.
The current style adds "*" as a prefix to each operator that
whole-stage codegen can be applied to.
So, in your test case, whole-stage codegen is already enabled!!
FYI. I think that it is a good topic for dev@spark.apache.org.
Kazuaki Ishizaki
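The "*" prefix convention described in this reply can also be read mechanically from captured explain text. The sketch below is a hypothetical helper, not part of Spark: `StarPrefix` and `codegenOperators` are names introduced here for illustration, and the regular expression assumes the plan layout shown in this thread. It also tolerates an optional "(n)" stage id after the star, which some later Spark versions print.

```scala
object StarPrefix {
  // Optional tree-drawing characters, then "*", then an optional "(n)" stage
  // id, then the operator name (captured).
  private val Marked = """^[\s:+\-|]*\*(?:\(\d+\)\s*)?(\w+)""".r

  // Hypothetical helper: names of the operators marked with "*" in explain text.
  def codegenOperators(planText: String): Seq[String] =
    planText.split("\n").toSeq
      .flatMap(line => Marked.findFirstMatchIn(line).map(_.group(1)))

  def main(args: Array[String]): Unit = {
    // The plan reported in the original question.
    val plan =
      """== Physical Plan ==
        |*Project [id#12L AS asId#15L]
        |+- *Filter (id#12L = 4)
        |   +- *Range (0, 10, step=1, splits=Some(4))""".stripMargin

    // prints: Project, Filter, Range
    println(codegenOperators(plan).mkString(", "))
  }
}
```

All three operators in the reported plan carry the star, which matches the conclusion that whole-stage codegen was already in effect for that query.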