Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/11/26 12:05:18 UTC

[GitHub] [incubator-doris] zenoyang opened a new issue #7230: [Optimize] Improve Sql Cache hit rate

zenoyang opened a new issue #7230:
URL: https://github.com/apache/incubator-doris/issues/7230


   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   Currently, when the cache mode is chosen, the latest update time across all tables to be scanned is obtained. If the time elapsed since that update is greater than cache_last_version_interval_second, the cache mode is SqlCache; if it is less, the cache mode is PartitionCache.
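
The table-level check described above can be sketched roughly as follows. This is an illustrative Python model, not the Doris implementation: the function name, the partition names, the 900-second interval, and the handling of the exact-equality boundary (which the description leaves unspecified) are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for cache_last_version_interval_second.
# The value 900 is an assumption chosen only for illustration.
CACHE_LAST_VERSION_INTERVAL_SECOND = 900


def choose_cache_mode_table_level(partition_update_times, now,
                                  interval_second=CACHE_LAST_VERSION_INTERVAL_SECOND):
    """Pick a cache mode from the latest update time of the whole table."""
    # At table granularity, the newest update among ALL partitions counts,
    # whether or not the query scans that partition.
    latest_update = max(partition_update_times.values())
    elapsed = now - latest_update
    # Assumption: "greater than" is treated as strict; equality falls through.
    if elapsed > timedelta(seconds=interval_second):
        return "SqlCache"
    return "PartitionCache"


# A made-up table with one old partition and one recently updated partition.
example_table = {
    "p_old": datetime(2021, 1, 1, 0, 0, 0),
    "p_new": datetime(2021, 1, 10, 0, 0, 0),
}
recent_mode = choose_cache_mode_table_level(example_table, datetime(2021, 1, 10, 0, 0, 5))
later_mode = choose_cache_mode_table_level(example_table, datetime(2021, 1, 11, 0, 0, 0))
print(recent_mode, later_mode)
```

In this sketch a single recently updated partition is enough to rule out SqlCache for every query on the table.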
   
   
   The update time is currently obtained at table granularity (that is, the latest update time among all partitions of the table). If this can be refined to partition granularity (comparing only the partitions scanned by the query), the hit rate of SqlCache can be improved. SqlCache should be used as much as possible, because PartitionCache has many restrictions: many queries are passed over, so its probability of being hit is small.
   
   
   For example:
   
   For a table tbl1, there are the following partitions:
   
   partition | updateTime
   -- | --
   20200111 | 2020-01-15 1:00:00
   20200112 | 2020-01-12 1:00:00
   20200113 | 2020-01-13 1:00:00
   20200114 | 2020-01-13 1:00:00
   20200115 | 2020-01-15 1:00:00
   
   
   Assuming that the current time is 2020-01-15 1:00:01,
   
   ```sql
   SELECT * FROM tbl1 WHERE dt>="2020-01-12" and dt<="2020-01-14";
   ```
   
   Since the latest update time of tbl1 is 2020-01-15 1:00:00, only one second before the current time, SqlCache cannot be used and PartitionCache is tried instead, even though the partitions scanned by this query were last updated a relatively long time ago.
   
   Expected:
   ```sql
   SELECT * FROM tbl1 WHERE dt>="2020-01-12" and dt<="2020-01-14";
   ```
   SqlCache can be hit.
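
The difference between the two granularities on this example can be sketched as below. This is an illustrative Python model rather than Doris code: the function name, the 900-second interval, and the mapping of the query's dt range onto partitions 20200112 to 20200114 (partition pruning is assumed to happen elsewhere) are assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical value for cache_last_version_interval_second (assumption).
INTERVAL_SECOND = 900

# Partition update times of tbl1, copied from the table above.
TBL1 = {
    "20200111": datetime(2020, 1, 15, 1, 0, 0),
    "20200112": datetime(2020, 1, 12, 1, 0, 0),
    "20200113": datetime(2020, 1, 13, 1, 0, 0),
    "20200114": datetime(2020, 1, 13, 1, 0, 0),
    "20200115": datetime(2020, 1, 15, 1, 0, 0),
}


def choose_cache_mode(update_times, now, scanned=None, interval_second=INTERVAL_SECOND):
    """Return "SqlCache" or "PartitionCache".

    scanned=None models the table-level check (all partitions count);
    a list of partition names models the proposed partition-level check.
    """
    names = update_times.keys() if scanned is None else scanned
    latest_update = max(update_times[name] for name in names)
    if now - latest_update > timedelta(seconds=interval_second):
        return "SqlCache"
    return "PartitionCache"


now = datetime(2020, 1, 15, 1, 0, 1)
# Assumed result of pruning for dt >= "2020-01-12" and dt <= "2020-01-14".
scanned_partitions = ["20200112", "20200113", "20200114"]

table_level = choose_cache_mode(TBL1, now)
partition_level = choose_cache_mode(TBL1, now, scanned_partitions)
print(table_level)      # PartitionCache
print(partition_level)  # SqlCache
```

With the partition-level check, the newest update among the scanned partitions is 2020-01-13 1:00:00, so the elapsed time exceeds the interval and SqlCache becomes eligible.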
   
   This optimization is beneficial in the following scenarios:
   - When historical data is re-loaded, queries on newer data can still hit SqlCache.
   - In real-time scenarios (where usually only the latest partition is updated frequently), queries on historical data can hit SqlCache.
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] zenoyang closed issue #7230: [Optimize] Improve Sql Cache hit rate

Posted by GitBox <gi...@apache.org>.
zenoyang closed issue #7230:
URL: https://github.com/apache/incubator-doris/issues/7230


   

