Posted to issues@spark.apache.org by "Zhang Jianguo (Jira)" <ji...@apache.org> on 2021/03/04 08:49:00 UTC

[jira] [Updated] (SPARK-34616) SessionCatalog#requireDbExists is called too many times affects SQL performance

     [ https://issues.apache.org/jira/browse/SPARK-34616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhang Jianguo updated SPARK-34616:
----------------------------------
    Description: 
Almost every SQL statement needs to check whether the database exists before execution.

Every check connects to the Metastore, which then checks whether the DB exists in its database. This can be too expensive.

A proposal is to add a switch that controls whether the DB's existence is checked.

[requireDbExists](https://github.com/apache/spark/blob/53e4dba7c489ac5c0ad61f0121c4e247de5b485c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L197)

!db_existance.PNG!
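
As a rough illustration of the proposal, the Python sketch below models a catalog whose existence check can be switched off. Everything in it is hypothetical: CountingMetastore, SketchCatalog and the check_db_exists flag are made-up names for illustration only, not Spark APIs or configuration keys. The real requireDbExists is Scala code in SessionCatalog, and this sketch only mimics its "check, then throw if missing" shape.

```python
class NoSuchDatabaseError(Exception):
    """Raised when a database is not found in the metastore."""


class CountingMetastore:
    """Hypothetical stand-in for a metastore client that counts lookups."""

    def __init__(self, databases):
        self._databases = set(databases)
        self.calls = 0

    def database_exists(self, db):
        # Each call stands for one round trip to the metastore.
        self.calls += 1
        return db in self._databases


class SketchCatalog:
    """Toy catalog; check_db_exists is the hypothetical switch."""

    def __init__(self, metastore, check_db_exists=True):
        self._metastore = metastore
        self._check_db_exists = check_db_exists

    def require_db_exists(self, db):
        if not self._check_db_exists:
            return  # switch off: skip the metastore round trip
        if not self._metastore.database_exists(db):
            raise NoSuchDatabaseError(db)

    def lookup_table(self, db, table):
        # Every catalog operation starts with the existence check.
        self.require_db_exists(db)
        return f"{db}.{table}"


def count_metastore_calls(check_db_exists, queries=3):
    """Run the same lookup several times and report metastore round trips."""
    metastore = CountingMetastore(["sales"])
    catalog = SketchCatalog(metastore, check_db_exists=check_db_exists)
    for _ in range(queries):
        catalog.lookup_table("sales", "orders")
    return metastore.calls
```

With the switch on, each lookup in this toy model costs one metastore call; with it off, none are made. The likely trade-off, under the same assumptions, is that a missing database would no longer be reported by this early check and would have to be caught by whatever operation touches the metastore next.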

 

  was:
Almost every SQL statement needs to check whether the database exists before execution.

Every check connects to the Metastore, which then checks whether the DB exists in its database. This can be too expensive.

A proposal is to add a switch that controls whether the DB's existence is checked.

[requireDbExists](https://github.com/apache/spark/blob/53e4dba7c489ac5c0ad61f0121c4e247de5b485c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L197)

 


> SessionCatalog#requireDbExists is called too many times affects SQL performance
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-34616
>                 URL: https://issues.apache.org/jira/browse/SPARK-34616
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Zhang Jianguo
>            Priority: Minor
>         Attachments: db_existance.PNG


