Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2022/03/28 01:22:53 UTC
[DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page
Hi all,
I have been investigating improvements for the pandas API on Spark,
specifically in the UI.
I chatted with a couple of people and decided to send an email here to
discuss further.
Currently, both SQL and DataFrame API queries are shown in the “SQL” tab, as below:
[image: Screen Shot 2022-03-25 at 12.18.14 PM.png]
which makes sense to developers because the DataFrame API shares the same SQL
core, but I believe it makes less sense to end users. Please consider two more
points:
- Spark ML users run the DataFrame-based MLlib API, but they have to check
the "SQL" tab.
- The pandas API on Spark arguably has no conceptual link to SQL itself, so
the "SQL" tab makes even less sense to users of the pandas API.
So I would like to propose renaming:
- "SQL" to "SQL/DataFrame"
- "Query" to "Execution"
There's a PR open at https://github.com/apache/spark/pull/35973. Please
let me know your thoughts on this.
Thanks.
Re: [DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page
Posted by Nicholas Chammas <ni...@gmail.com>.
+1
Understanding the close relationship between SQL and DataFrames in Spark was a key learning moment for me, but I agree that using the terms interchangeably can be confusing.
Re: [DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page
Posted by Hyukjin Kwon <gu...@gmail.com>.
*For some reason, the image looks broken (to me). I am attaching it again to
make sure.
[image: Screen Shot 2022-03-25 at 12.18.14 PM.png]