Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/01/15 15:06:00 UTC

[jira] [Work logged] (HIVE-24644) QueryResultCache parses the query twice

     [ https://issues.apache.org/jira/browse/HIVE-24644?focusedWorklogId=536535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-536535 ]

ASF GitHub Bot logged work on HIVE-24644:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Jan/21 15:05
            Start Date: 15/Jan/21 15:05
    Worklog Time Spent: 10m 
      Work Description: kasakrisz opened a new pull request #1874:
URL: https://github.com/apache/hive/pull/1874


   ### What changes were proposed in this pull request?
   The query results cache uses the query text with fully qualified table names as its cache key. By the time query compilation reaches the point where the results cache key is generated, the unparseTranslator instance already holds the fully qualified table names. Use it to generate the cache key.
   Generating the key from the query text also requires the TokenRewriteStream instance of the parsed query. Applying the transformations stored in the unparseTranslator would alter the TokenRewriteStream and make it invalid for further use. To avoid this, a dedicated TokenRewriteStream program is introduced for the query results cache.
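
   The idea of a dedicated rewrite program can be illustrated with a minimal, self-contained sketch. It is not the Hive or ANTLR code: `RewritableTokens`, `CacheKeyDemo`, and the program name `resultsCache` are hypothetical names chosen for illustration, and tokens are simplified to plain strings. It only shows that replacements recorded under one named program leave the rendering of every other program untouched.

   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   import java.util.TreeMap;

   /** Simplified stand-in for a token stream with named rewrite programs (hypothetical). */
   class RewritableTokens {
       private final List<String> tokens;
       // program name -> (token index -> replacement text)
       private final Map<String, Map<Integer, String>> programs = new HashMap<>();

       RewritableTokens(List<String> tokens) {
           this.tokens = new ArrayList<>(tokens);
       }

       /** Records a replacement in the named program only; the tokens are never modified. */
       void replace(String program, int tokenIndex, String text) {
           programs.computeIfAbsent(program, k -> new TreeMap<>()).put(tokenIndex, text);
       }

       /** Renders the tokens with the replacements of a single program applied. */
       String render(String program) {
           Map<Integer, String> edits =
               programs.getOrDefault(program, Collections.<Integer, String>emptyMap());
           StringBuilder sb = new StringBuilder();
           for (int i = 0; i < tokens.size(); i++) {
               if (i > 0) {
                   sb.append(' ');
               }
               sb.append(edits.getOrDefault(i, tokens.get(i)));
           }
           return sb.toString();
       }
   }

   class CacheKeyDemo {
       // Hypothetical program name reserved for cache key generation.
       static final String CACHE_PROGRAM = "resultsCache";

       public static void main(String[] args) {
           RewritableTokens stream =
               new RewritableTokens(Arrays.asList("SELECT", "*", "FROM", "t1"));
           // Qualify the table name in the cache program only.
           stream.replace(CACHE_PROGRAM, 3, "default.t1");
           System.out.println(stream.render(CACHE_PROGRAM)); // SELECT * FROM default.t1
           System.out.println(stream.render("default"));     // SELECT * FROM t1
       }
   }
   ```

   Because the replacement lives only in the `resultsCache` program, the second rendering still returns the original text, which is the property that lets the same stream be reused later in compilation.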
   
   ### Why are the changes needed?
   Every query was parsed twice:
   * first, to build the AST for compilation
   * second, to generate the cache key from the query text with fully qualified table names.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Ran the existing qtests for the results cache:
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestMiniLlapLocalCliDriver -Dqfile=results_cache_invalidation2.q,results_cache_with_masking.q,results_cache_lifetime.q,results_cache_temptable.q,results_cache_with_auth.q,results_cache_3.q,results_cache_1.q,results_cache_empty_result.q,results_cache_capacity.q,results_cache_diff_fs.q,results_cache_2.q,results_cache_truncate.q,results_cache_quoted_identifiers.q,results_cache_transactional.q,results_cache_invalidation.q -pl itests/qtest -Pitests
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 536535)
    Remaining Estimate: 0h
            Time Spent: 10m

> QueryResultCache parses the query twice
> ---------------------------------------
>
>                 Key: HIVE-24644
>                 URL: https://issues.apache.org/jira/browse/HIVE-24644
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, Parser
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The query result cache looks up results by query text that has fully resolved table references.
> To generate this query text, the current implementation
> * transforms the AST back to a String
> * parses the String generated in the previous step
> * traverses the new AST and replaces the table references with the fully qualified ones
> * transforms the new AST back to a String -> this will be the cache key



--
This message was sent by Atlassian Jira
(v8.3.4#803005)