You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/10/30 09:46:17 UTC

[GitHub] [hive] SelfImpr001 opened a new pull request, #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…

SelfImpr001 opened a new pull request, #3714:
URL: https://github.com/apache/hive/pull/3714

   ### What changes were proposed in this pull request?
   When the Hive metabase uses Mysql, during the peak period of Hive statement query, the metastore initiates a large number of DirectSQL, which will cause performance problems in the metabase. The fundamental reason is that some DirectSQL performance problems cause a large number of slow queries at the DB level. This PR is mainly for optimization. Slow query problem at the metabase level
   
   ### Why are the changes needed?
   For example for the following Hive query:
   
   `
   select * from imd_fcac_safe.fcac_dw_loan_details where ds='2021-10-10' and sysid='MCFCM'
   `
   
   where ds and sysid are the primary and secondary partitions of the imd_fcac_safe.fcac_dw_loan_details table, respectively
   
   The Hive statement will generate the DirectSQL query as follows:
   
   `
   explain select PARTITIONS.PART_ID from PARTITIONS  inner join TBLS on PARTITIONS.TBL_ID = TBLS.TBL_ID     
   and TBLS.TBL_NAME = 'fcac_dw_loan_details'   inner join DBS on TBLS.DB_ID = DBS.DB_ID      and DBS.NAME = 'imd_fcac_safe' 
   inner join PARTITION_KEY_VALS FILTER0 on FILTER0.PART_ID = PARTITIONS.PART_ID and FILTER0.INTEGER_IDX = 0 
   inner join PARTITION_KEY_VALS FILTER1 on FILTER1.PART_ID = PARTITIONS.PART_ID and FILTER1.INTEGER_IDX = 1 
   where ( ((FILTER0.PART_KEY_VAL = '2021-10-10') and (FILTER1.PART_KEY_VAL = 'MCFCM')) )
   `
   ![image](https://user-images.githubusercontent.com/11420267/198872058-609b8a19-c3ba-4220-a7a8-138150d391fc.png)
   
   Problems with this statement
   There is no TBL_ID field in the PARTITION_KEY_VALS table, which will cause the partition of the same name of the unrelated table to be described when performing an associated query; there is no index column in the PARTITION_KEY_VAL table, so it cannot be accelerated by the index.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   After compiling and packaging, execute the relevant partition filtering statement through the client. After repairing, the corresponding query corresponds to the DirectSQL execution plan as follows:
   
   ![image](https://user-images.githubusercontent.com/11420267/198872182-e4e69d49-75b1-45b8-9089-8ee4e1b8c27e.png)
   
   It can be seen that the amount of query data for partitioned tables has decreased. During the peak period of Hive tasks, the CPU load performance problem at the DB level has decreased.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] github-actions[bot] commented on pull request #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #3714:
URL: https://github.com/apache/hive/pull/3714#issuecomment-1368574883

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] github-actions[bot] closed pull request #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…
URL: https://github.com/apache/hive/pull/3714


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SelfImpr001 commented on pull request #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…

Posted by GitBox <gi...@apache.org>.
SelfImpr001 commented on PR #3714:
URL: https://github.com/apache/hive/pull/3714#issuecomment-1299654405

   Can someone please review this pr


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sonarcloud[bot] commented on pull request #3714: [WIP][HIVE-26649] [Metastore] Hive metabase performance issues due to slow…

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #3714:
URL: https://github.com/apache/hive/pull/3714#issuecomment-1296204288

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3714)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3714&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3714&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3714&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=CODE_SMELL) [0 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3714&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3714&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3714&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org