Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/06 07:15:52 UTC

[GitHub] [incubator-doris] kangkaisen opened a new issue #2846: [Proposal] Support in memory olap table in Segment V2

kangkaisen opened a new issue #2846: [Proposal] Support in memory olap table in Segment V2
URL: https://github.com/apache/incubator-doris/issues/2846
 
 
   ## 1 Why we need in-memory OLAP tables
   Currently, disk seeks are still a bottleneck for most Doris queries, so we could use in-memory tables to speed up Doris queries, as other systems do (HBase, Arrow, ClickHouse, ...).
   
   ## 2 How to implement in-memory OLAP tables
   As far as I know, there are two ways to implement an in-memory table.
   
   ### 2.1 Cache disk data in memory
   HBase takes this approach: it implements in-memory tables through its `BlockCache`. For reference, see http://hbase.apache.org/book.html#block.cache.design and https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
   
   ### 2.2 Design a special memory layout and data structures for in-memory data
   Examples of this approach:
   ClickHouse  https://clickhouse.tech/docs/en/operations/table_engines/memory/
   Apache Arrow  http://arrow.apache.org/docs/cpp/api/table.html#tables
   SnappyData  https://blog.bcmeng.com/post/snappydata.html
   MemSQL  https://www.memsql.com/blog/what-is-skiplist-why-skiplist-index-for-memsql/
   
   **Because Doris has already implemented a page cache, I have decided to implement in-memory tables based on the page cache in Doris.** Of course, we could implement a special in-memory table engine in the future; the two solutions do not conflict.
   
   ## 3 Detailed Design
   1 Introduce `CachePriority` to `LRUCache`. Entries with a lower `CachePriority` in `LRUCache` are evicted first. Currently `CachePriority` has two values: `DURABLE` for in-memory tables and `NORMAL` for normal tables.
   When `_evict_from_lru` runs, it first evicts all cache entries with `NORMAL` priority, and only then evicts cache entries with `DURABLE` priority.
   2 Add an `in_memory` property to `OlapTable`
   3 Add an `is_in_memory` field to `TabletSchema`
   4 Add a `cache_in_memory` field to `ColumnReaderOptions`
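
The eviction order in step 1 could be sketched roughly as below. This is only an illustration under stated assumptions, not the actual Doris `LRUCache`: the class name `PriorityLruCache`, its methods, the use of one LRU list per priority, and counting capacity in entries rather than bytes are all hypothetical. Only `CachePriority`, `NORMAL`, `DURABLE`, and `_evict_from_lru` come from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Names taken from the proposal; everything else here is hypothetical.
enum class CachePriority { NORMAL = 0, DURABLE = 1 };

// Toy cache: capacity is a number of entries (an assumption; a real page
// cache would more likely track a byte charge per entry).
class PriorityLruCache {
public:
    explicit PriorityLruCache(std::size_t capacity) : _capacity(capacity) {}

    // Insert or overwrite an entry, then evict until the cache fits.
    void insert(const std::string& key, std::string value, CachePriority priority) {
        erase(key);
        std::list<Entry>& lru = list_for(priority);
        lru.push_back(Entry{key, std::move(value), priority});
        _index[key] = std::prev(lru.end());
        _evict_from_lru();
    }

    // Return the value if present and mark the entry as most recently used.
    std::optional<std::string> lookup(const std::string& key) {
        auto it = _index.find(key);
        if (it == _index.end()) return std::nullopt;
        std::list<Entry>& lru = list_for(it->second->priority);
        lru.splice(lru.end(), lru, it->second);  // iterator stays valid
        return it->second->value;
    }

    bool contains(const std::string& key) const { return _index.count(key) != 0; }
    std::size_t size() const { return _index.size(); }

private:
    struct Entry {
        std::string key;
        std::string value;
        CachePriority priority;
    };

    std::list<Entry>& list_for(CachePriority p) {
        return p == CachePriority::DURABLE ? _durable : _normal;
    }

    void erase(const std::string& key) {
        auto it = _index.find(key);
        if (it == _index.end()) return;
        list_for(it->second->priority).erase(it->second);
        _index.erase(it);
    }

    // Evict the least recently used NORMAL entries first; touch DURABLE
    // entries only once no NORMAL entry is left.
    void _evict_from_lru() {
        while (_index.size() > _capacity) {
            std::list<Entry>& victims = !_normal.empty() ? _normal : _durable;
            _index.erase(victims.front().key);
            victims.pop_front();
        }
    }

    std::size_t _capacity;
    std::list<Entry> _normal;   // front = least recently used
    std::list<Entry> _durable;  // front = least recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> _index;
};
```

One consequence of this ordering, at least in this sketch: when the cache is already full of `DURABLE` entries, a newly inserted `NORMAL` entry is evicted immediately, so the memory reserved for in-memory tables has to be sized with normal tables in mind.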
   
   
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] kangkaisen commented on issue #2846: [Proposal] Support in memory olap table in Segment V2

Posted by GitBox <gi...@apache.org>.
kangkaisen commented on issue #2846: [Proposal] Support in memory olap table in Segment V2
URL: https://github.com/apache/incubator-doris/issues/2846#issuecomment-583208333
 
 
   @imay 
   I will do a benchmark.
   
   OK, I will add an `in_memory` option to the partition properties.



[GitHub] [incubator-doris] imay commented on issue #2846: [Proposal] Support in memory olap table in Segment V2

Posted by GitBox <gi...@apache.org>.
imay commented on issue #2846: [Proposal] Support in memory olap table in Segment V2
URL: https://github.com/apache/incubator-doris/issues/2846#issuecomment-583197900
 
 
   Hi @kangkaisen 
   
   Good job!
   
   Do you have any benchmarks for this improvement? For example, when the whole table is cached in memory, how do queries perform?
   
   Also, I think the `in_memory` option would be better as a partition property. That way, we can keep the most-visited partitions in memory and the others on SSD or HDD, which will maximize the use of memory.



[GitHub] [incubator-doris] kangkaisen closed issue #2846: [Proposal] Support in memory olap table in Segment V2

Posted by GitBox <gi...@apache.org>.
kangkaisen closed issue #2846: [Proposal] Support in memory olap table in Segment V2
URL: https://github.com/apache/incubator-doris/issues/2846
 
 
   
