Posted to dev@doris.apache.org by 李航宇 <li...@flywheels.com> on 2023/02/13 03:23:22 UTC

[Discuss][DSIP] (Support high concurrent point query)

    Doris is built on a column-oriented storage format. In high-concurrency serving scenarios, users usually want to fetch whole rows, but a column-oriented format greatly amplifies random read IO when the table is wide.
The Doris query engine and planner are also too heavy for simple queries such as point queries, so we need a short, fast path for them. In addition, the FE is the access layer for SQL queries and is written in Java; analyzing and parsing SQL causes very high CPU overhead under high-concurrency workloads.
    To address these drawbacks, the following optimizations can be applied:
1. Row Store Format Optimization: In high-concurrency serving scenarios, users often want to retrieve entire rows. To reduce the random read IO on wide tables, a row store format can be introduced. This format stores all the columns of a row together, so an entire row can be retrieved in a single read operation, which reduces the number of disk accesses and improves performance.
2. Short Path Optimization for Point Queries: The heavy query engine and planner add significant overhead to simple point queries. To address this, a short path can be implemented for point queries that bypasses the full query engine and retrieves the required data directly.
3. Prepared Statement Optimization: Part of the CPU overhead under high concurrency comes from analyzing and parsing SQL in the frontend (FE) layer. To address this, prepared statements can be supported. A prepared statement is a precompiled SQL statement that can be executed multiple times, so the cost of analyzing and parsing is paid once instead of on every execution.
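To make the read amplification in item 1 concrete, here is a minimal Python sketch. It is a toy model only, not Doris's storage layout: the ColumnStore and RowStore classes and the "one simulated read per column" counting are assumptions made for illustration.

```python
import json


class ColumnStore:
    """Toy columnar layout (hypothetical): one list per column."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.reads = 0

    def append(self, row):
        for name in self.columns:
            self.columns[name].append(row[name])

    def get_row(self, idx):
        row = {}
        for name, values in self.columns.items():
            self.reads += 1  # assumption: one random read per column touched
            row[name] = values[idx]
        return row


class RowStore:
    """Toy row layout (hypothetical): each row is serialized into one blob."""

    def __init__(self):
        self.blobs = []
        self.reads = 0

    def append(self, row):
        self.blobs.append(json.dumps(row))

    def get_row(self, idx):
        self.reads += 1  # assumption: the whole row comes back in one read
        return json.loads(self.blobs[idx])


# A wide table with 100 columns and 10 rows.
columns = [f"c{i}" for i in range(100)]
rows = [{name: r * 1000 + i for i, name in enumerate(columns)} for r in range(10)]

col_store, row_store = ColumnStore(columns), RowStore()
for row in rows:
    col_store.append(row)
    row_store.append(row)

# Both layouts return the same row, at very different read counts.
assert col_store.get_row(7) == row_store.get_row(7)
print(col_store.reads, row_store.reads)  # 100 1
```

In this model a single point lookup on the 100-column table costs 100 simulated reads in the columnar layout and 1 in the row layout. Real costs depend on page sizes, caching, and compression, which the sketch ignores.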
In conclusion, these optimizations can help address the performance issues Doris faces in high-concurrency scenarios. With a row store format, a short path for point queries, and prepared statements, Doris can serve high-concurrency point queries quickly and efficiently.
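As a rough illustration of how items 2 and 3 could fit together, the Python sketch below parses a point query once at prepare time and then serves each execution with a direct key lookup. The ToyFrontend class, its regular expression, and the in-memory tables are hypothetical and are not the Doris FE implementation or its API.

```python
import re


class ToyFrontend:
    """Hypothetical frontend that caches the parsed form of a point query."""

    POINT_QUERY = re.compile(
        r"^SELECT \* FROM (\w+) WHERE (\w+) = \?$", re.IGNORECASE
    )

    def __init__(self, tables):
        self.tables = tables      # table name -> {primary key -> row}
        self.prepared = {}        # statement id -> (table, key column)
        self.parse_count = 0

    def prepare(self, sql):
        # Parsing and analysis happen only here, once per statement.
        self.parse_count += 1
        match = self.POINT_QUERY.match(sql.strip())
        if match is None:
            raise ValueError("this sketch only supports single-key point queries")
        stmt_id = len(self.prepared) + 1
        self.prepared[stmt_id] = (match.group(1), match.group(2))
        return stmt_id

    def execute(self, stmt_id, key):
        # Short path: reuse the cached result of prepare() and look the
        # row up by key, with no parsing or planning per execution.
        table, _key_column = self.prepared[stmt_id]
        return self.tables[table].get(key)


fe = ToyFrontend({"users": {1: {"id": 1, "name": "a"}, 2: {"id": 2, "name": "b"}}})
stmt = fe.prepare("SELECT * FROM users WHERE id = ?")
results = [fe.execute(stmt, key) for key in (1, 2, 1, 3)]
print(fe.parse_count)  # 1
```

Four executions are served with a single parse; the lookup for the missing key 3 returns None. A real client would typically reach the same effect through the prepared-statement support of its database driver.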

李航宇
lihangyu@flywheels.com


Re: [Discuss][DSIP] (Support high concurrent point query)

Posted by Mingyu Chen <mo...@163.com>.
Hi Hangyu,
Thanks for your proposal.
I have created a new DSIP[1], and you can explain more in it.
Could you please tell me your wiki account name? I will then grant you write permission.


[1] https://cwiki.apache.org/confluence/display/DORIS/DSIP-031%3A+Row+store+for+hight+concurrency+serving+scenario




--

Best Regards
陈明雨 Mingyu Chen

Email:
morningman@apache.org




