Posted to dev@community.apache.org by "Zhijing Lu (Jira)" <ji...@apache.org> on 2023/03/13 04:50:00 UTC
[jira] [Updated] (COMDEV-510) [GSoC][Doris]Page Cache Improvement
[ https://issues.apache.org/jira/browse/COMDEV-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhijing Lu updated COMDEV-510:
------------------------------
Description:
*Apache Doris*
Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}GitHub{*}: [https://github.com/apache/doris]
h3. *Background*
Apache Doris accelerates high-concurrency queries with a page cache, which stores decompressed data.
Currently, the page cache in Apache Doris uses a simple LRU algorithm, which has a few problems:
* Hot data is evicted from the cache by large queries.
* The page cache configuration is immutable and does not support GC.
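
To illustrate the first problem, here is a minimal Python sketch of a plain LRU cache. It is not Doris code: the class name {{LRUCache}}, the page ids, and the capacity of 4 are all assumptions made up for this example. It shows how a single large scan of one-off pages can push frequently used pages out of the cache.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache keyed by page id (illustrative only, not Doris code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently used entry comes first

    def read(self, page_id):
        """Return True on a cache hit; on a miss, load the page and return False."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return False


def run_demo():
    cache = LRUCache(capacity=4)  # hypothetical capacity
    hot_pages = ["hot-1", "hot-2"]

    # The hot pages are read repeatedly by small queries.
    for _ in range(3):
        for page in hot_pages:
            cache.read(page)

    # One large query scans ten pages that are never read again.
    for i in range(10):
        cache.read(f"scan-{i}")

    # Check whether the hot pages survived the scan.
    return [cache.read(page) for page in hot_pages]


hot_hits_after_scan = run_demo()
print(hot_hits_after_scan)
```

Under these assumptions, both hot pages miss after the scan, because the scanned pages were more recently used than the hot ones when eviction happened.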
h3. Task
* {*}Phase One{*}: Measure the impact on queries when the decompressed data is stored in memory versus on SSD, and then determine whether a full page cache is required.
* {*}Phase Two{*}: Improve the cache strategy for Apache Doris based on the results from Phase One.
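
The issue does not say which strategy Phase Two should adopt. As one possible direction only, the sketch below shows a segmented LRU, a well-known scan-resistant variant of LRU. The class name {{SegmentedLRU}}, the segment sizes, and the page ids are assumptions for illustration and do not come from Doris.

```python
from collections import OrderedDict


class SegmentedLRU:
    """Illustrative segmented LRU: new pages start in a probation segment
    and move to a protected segment only when they are read again."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # pages seen once
        self.protected = OrderedDict()  # pages seen at least twice

    def read(self, page_id):
        """Return True on a cache hit; on a miss, admit the page and return False."""
        if page_id in self.protected:
            self.protected.move_to_end(page_id)
            return True
        if page_id in self.probation:
            # Second access: promote to the protected segment.
            del self.probation[page_id]
            self.protected[page_id] = True
            if len(self.protected) > self.protected_size:
                # Demote the coldest protected page back to probation.
                demoted, _ = self.protected.popitem(last=False)
                self._admit(demoted)
            return True
        self._admit(page_id)
        return False

    def _admit(self, page_id):
        """Insert into probation, evicting its least recently used page if full."""
        self.probation[page_id] = True
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)


def run_slru_demo():
    cache = SegmentedLRU(probation_size=2, protected_size=2)  # hypothetical sizes
    hot_pages = ["hot-1", "hot-2"]

    # Repeated reads promote the hot pages into the protected segment.
    for _ in range(3):
        for page in hot_pages:
            cache.read(page)

    # A large scan of one-off pages only churns the probation segment.
    for i in range(10):
        cache.read(f"scan-{i}")

    return [cache.read(page) for page in hot_pages]


slru_hot_hits = run_slru_demo()
print(slru_hot_hits)
```

With the same total capacity of 4 pages and the same access pattern as a plain LRU would see, the hot pages remain cached here because pages read only once never reach the protected segment. Whether this trade-off suits Doris would depend on the Phase One measurements.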
h3. Learning Material
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}GitHub{*}: [https://github.com/apache/doris]
h3. Mentor
* Mentor: Yongqiang Yang, Apache Doris PMC member & Committer, [yangyongqiang@apache.org|mailto:yangyongqiang@apache.org]
* Mentor: Haopeng Li, Apache Doris PMC member & Committer, [lihaopeng@apache.org|mailto:lihaopeng@apache.org]
* Mailing List: dev@doris.apache.org
was:
*Apache Doris*
Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}Github{*}: [https://github.com/apache/doris]
h3. *Background*
Apache Doris accelerates high-concurrency queries utilizing page cache, where the decompressed data is stored.
Currently, the page cache in Apache Doris uses a simple LRU algorithm, which reveals a few problems:
# Hot data will be phased out in large queries
# The page cache configuration is immutable and does not support GC.
h3. Task
# {*}Phase One{*}: Identify the impacts on queries when the decompressed data is stored in memory and SSD, respectively, and then determine whether full page cache is required.
# {*}Phase Two{*}: Improve the cache strategy for Apache Doris based on the results from Phase One.
h3. Learning Material
{*}Page{*}: https://doris.apache.org
{*}Github{*}: [https://github.com/apache/doris]
h3. Mentor
* Mentor: Yongqiang Yang, Apache Doris PMC member & Committer, [yangyongqiang@apache.org|mailto:yangyongqiang@apache.org]
* Mentor: Haopeng Li, Apache Doris PMC member & Committer, [lihaopeng@apache.org|mailto:lihaopeng@apache.org]
* Mailing List: dev@doris.apache.org
> [GSoC][Doris]Page Cache Improvement
> -----------------------------------
>
> Key: COMDEV-510
> URL: https://issues.apache.org/jira/browse/COMDEV-510
> Project: Community Development
> Issue Type: Task
> Components: GSoC/Mentoring ideas
> Reporter: Zhijing Lu
> Priority: Major
> Labels: ApacheDoris, Mentor, full-time, gsoc2023