Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2022/03/28 01:22:53 UTC

[DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page

Hi all,

I have been investigating improvements to the pandas API on Spark,
specifically in the UI.
I chatted with a couple of people and decided to send an email here to
discuss it further.

Currently, both the SQL and DataFrame APIs are shown in the “SQL” tab, as below:

[image: Screen Shot 2022-03-25 at 12.18.14 PM.png]

which makes sense to developers because the DataFrame API shares the same
SQL core, but I believe it makes less sense to end users. Please consider
two more points:

   - Spark ML users run the DataFrame-based MLlib API, but they have to
   check the "SQL" tab.
   - The pandas API on Spark arguably has no conceptual link to SQL itself,
   so the "SQL" tab makes even less sense to users of the pandas API.


So I would like to propose renaming:

   - "SQL" to "SQL/DataFrame"
   - "Query" to "Execution"


There's a PR open at https://github.com/apache/spark/pull/35973. Please
let me know your thoughts on this.

Thanks.

Re: [DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page

Posted by Nicholas Chammas <ni...@gmail.com>.
+1

Understanding the close relationship between SQL and DataFrames in Spark was a key learning moment for me, but I agree that using the terms interchangeably can be confusing.


> On Mar 27, 2022, at 9:27 PM, Hyukjin Kwon <gu...@gmail.com> wrote:
> 
> *for some reason, the image looks broken (to me). I am attaching again to make sure.
> 
> <Screen Shot 2022-03-25 at 12.18.14 PM.png>


Re: [DISCUSS] Rename 'SQL' to 'SQL / DataFrame', and 'Query' to 'Execution' in SQL UI page

Posted by Hyukjin Kwon <gu...@gmail.com>.
*For some reason, the image looks broken (to me). I am attaching it again to
make sure.

[image: Screen Shot 2022-03-25 at 12.18.14 PM.png]
