Posted to issues@kvrocks.apache.org by GitBox <gi...@apache.org> on 2022/09/25 01:27:31 UTC

[GitHub] [incubator-kvrocks] xiaobiaozhao opened a new issue, #918: support Multi-disk(Multi-path)

xiaobiaozhao opened a new issue, #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-kvrocks/issues) and found no similar issues.
   
   
   ### Motivation
   
   When the host has multiple disks, kvrocks could spread data storage across them to improve performance.
   Hot data can be stored on local SSDs and cold data on cloud disks.
   
   ### Solution
   
   ```
   option.db_paths = {
                        {"/disk1", 1000 * 1000 * 1000},
                        {"/disk2", 1000 * 1000 * 1000},
                        {"/disk3", 1000 * 1000 * 1000},
                        {"/disk4", 1000 * 1000 * 1000}};
   ```
   
   https://github.com/facebook/rocksdb/blob/main/include/rocksdb/options.h#L672
   
   ### Are you willing to submit a PR?
   
   - [X] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@kvrocks.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-kvrocks] jishengming1 commented on issue #918: support Multi-disk(Multi-path)

Posted by "jishengming1 (via GitHub)" <gi...@apache.org>.
jishengming1 commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1409934665

   > > Hi @xiaobiaozhao, I have a question: I have two SSDs. Is there a way to split the hot data?
   > 
   > https://github.com/apache/incubator-kvrocks/pull/953/files#diff-e29cedc586b39d07b64be1df007101989a20f7d4452fc18fe23136f6d4ccd331R792
   > 
   > "/mnt/ssd 10G; /mnt/hdd 1T;" (hot data on the first path, cold data on the second)
   
   I am not trying to distinguish between hot and cold data. I want both disks to carry the same load, with the hot data distributed across the two disks.




[GitHub] [incubator-kvrocks] marlboroman81 commented on issue #918: support Multi-disk(Multi-path)

Posted by "marlboroman81 (via GitHub)" <gi...@apache.org>.
marlboroman81 commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1414030549

   > I have two SSDs. Is there a way to split the hot data?
   
   You can build a RAID 0 array from several disks and place the data directory on it, or use a ZFS pool consisting of several disks.




[GitHub] [incubator-kvrocks] xiaobiaozhao commented on issue #918: support Multi-disk(Multi-path)

Posted by "xiaobiaozhao (via GitHub)" <gi...@apache.org>.
xiaobiaozhao commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1410466780

   > > > Hi @xiaobiaozhao, I have a question: I have two SSDs. Is there a way to split the hot data?
   > > 
   > > 
   > > https://github.com/apache/incubator-kvrocks/pull/953/files#diff-e29cedc586b39d07b64be1df007101989a20f7d4452fc18fe23136f6d4ccd331R792
   > > "/mnt/ssd 10G; /mnt/hdd 1T;" (hot data on the first path, cold data on the second)
   > 
   > I am not trying to distinguish between hot and cold data. I want both disks to carry the same load, with the hot data distributed across the two disks.
   
   You can try cluster mode.
   




[GitHub] [incubator-kvrocks] jishengming1 commented on issue #918: support Multi-disk(Multi-path)

Posted by "jishengming1 (via GitHub)" <gi...@apache.org>.
jishengming1 commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1408439721

   Hi @xiaobiaozhao, I have a question:
   I have two SSDs. Is there a way to split the hot data?




[GitHub] [incubator-kvrocks] caipengbo commented on issue #918: support Multi-disk(Multi-path)

Posted by GitBox <gi...@apache.org>.
caipengbo commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1257097594

   Hi @xiaobiaozhao, I have two questions:
   1. Why can multiple disks improve performance? Multiple paths do not seem to work in parallel.
   2. How do we distinguish between hot and cold data? RocksDB decides where to place an SST based only on when the SST was generated.




[GitHub] [incubator-kvrocks] caipengbo commented on issue #918: support Multi-disk(Multi-path)

Posted by GitBox <gi...@apache.org>.
caipengbo commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1257361015

   > In fact, the level an SST sits at reflects how hot or cold its data is. Because RocksDB uses an LSM tree, it naturally merges cold data into higher levels.
   
   @tanruixiang But about 90% of the data falls to the last layer of the LSM, so does that mean that 90% of the data is cold?




[GitHub] [incubator-kvrocks] xiaobiaozhao commented on issue #918: support Multi-disk(Multi-path)

Posted by GitBox <gi...@apache.org>.
xiaobiaozhao commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1257314367

   
   
   
   > > Hi @xiaobiaozhao, I have two questions:
   > > 
   > > 1. Why can multiple disks improve performance? Multiple paths do not seem to work in parallel.
   > > 2. How do we judge hot and cold data in kvrocks? RocksDB simply determines where to place the SST based on when the SST was generated.
   > 
   > According to the description of this option, lower-level SSTs are stored on the paths at the front of `db_paths`. So we can order `db_paths` by the speed of the storage medium and keep the low-level SSTs on the faster medium, for example by putting the SSD first in `db_paths` to store the low-level SSTs.
   > 
   > In fact, the level an `SST` sits at reflects how hot or cold its data is. Because RocksDB uses an LSM tree, it naturally merges cold data into higher levels.
   > 
   > So if this feature is used, RocksDB can help us store cold data on slower storage media such as mechanical hard drives, and hot data on faster storage media such as SSDs.
   
   Yes. In my test demo, RocksDB used only the first and last entries of the `db_paths` config.




[GitHub] [incubator-kvrocks] tanruixiang commented on issue #918: support Multi-disk(Multi-path)

Posted by GitBox <gi...@apache.org>.
tanruixiang commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1257513937

   > > In fact, the level an SST sits at reflects how hot or cold its data is. Because RocksDB uses an LSM tree, it naturally merges cold data into higher levels.
   > 
   > @tanruixiang But about 90% of the data falls to the last layer of the LSM, so does that mean that 90% of the data is cold?
   
   Most of the data should be cold. If data is hot, it re-enters the upper levels, and the copy in the last level may be deleted.
   For example, if `key1` is in the last level and we put `key1` again, the new `key1` returns to the upper levels once it is flushed from the memtable to an SST, and the `key1` in the last level becomes invalid. Of course, data that is only read should be served from the cache even if it sits in the last level.
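
The `key1` example can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyLsm`, `Put`, `Flush`, `Get`, and `VisibleLevel` are hypothetical names, each level is a plain `std::map`, and there is no compaction, so it shows only how a newer write shadows the copy in the last level, not how RocksDB is implemented.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy LSM model (hypothetical, not RocksDB code).
// levels[0] holds the newest flushed data, levels.back() the oldest.
struct ToyLsm {
  typedef std::map<std::string, std::string> Level;

  Level memtable;
  std::vector<Level> levels;

  explicit ToyLsm(std::size_t num_levels)
      : levels(num_levels > 0 ? num_levels : 1) {}

  // Writes always land in the memtable first.
  void Put(const std::string& key, const std::string& value) {
    memtable[key] = value;
  }

  // Flush moves every memtable entry into the top level.
  void Flush() {
    for (Level::const_iterator it = memtable.begin(); it != memtable.end(); ++it) {
      levels[0][it->first] = it->second;
    }
    memtable.clear();
  }

  // Returns where the visible version of `key` lives:
  // -1 for the memtable, a level index otherwise, -2 if the key is absent.
  int VisibleLevel(const std::string& key) const {
    if (memtable.count(key)) return -1;
    for (std::size_t i = 0; i < levels.size(); ++i) {
      if (levels[i].count(key)) return static_cast<int>(i);
    }
    return -2;
  }

  // Reads search newest to oldest, so a newer copy shadows an older one.
  // Returns nullptr if the key is absent.
  const std::string* Get(const std::string& key) const {
    const int where = VisibleLevel(key);
    if (where == -2) return nullptr;
    const Level& src = (where == -1) ? memtable : levels[where];
    return &src.find(key)->second;
  }
};
```

In this model, a `key1` that sits in the last level and is then written again becomes visible from level 0 after `Flush()`, while the stale copy stays in the last level until something (compaction, in a real LSM tree) removes it.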




[GitHub] [incubator-kvrocks] xiaobiaozhao commented on issue #918: support Multi-disk(Multi-path)

Posted by "xiaobiaozhao (via GitHub)" <gi...@apache.org>.
xiaobiaozhao commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1409911961

   > Hi @xiaobiaozhao, I have a question: I have two SSDs. Is there a way to split the hot data?
   
   https://github.com/apache/incubator-kvrocks/pull/953/files#diff-e29cedc586b39d07b64be1df007101989a20f7d4452fc18fe23136f6d4ccd331R792




[GitHub] [incubator-kvrocks] jishengming1 commented on issue #918: support Multi-disk(Multi-path)

Posted by "jishengming1 (via GitHub)" <gi...@apache.org>.
jishengming1 commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1414651542

   > > I have two SSDs. Is there a way to split the hot data?
   > 
   > You can build a RAID 0 array from several disks and place the data directory on it, or use a ZFS pool consisting of several disks.
   
   Thanks, I'll try.




[GitHub] [incubator-kvrocks] tanruixiang commented on issue #918: support Multi-disk(Multi-path)

Posted by GitBox <gi...@apache.org>.
tanruixiang commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1257221420

   > Hi @xiaobiaozhao, I have two questions:
   > 
   > 1. Why can multiple disks improve performance? Multiple paths do not seem to work in parallel.
   > 2. How do we judge hot and cold data in kvrocks? RocksDB simply determines where to place the SST based on when the SST was generated.
   
   According to the description of this option, lower-level SSTs are stored on the paths at the front of `db_paths`. So we can order `db_paths` by the speed of the storage medium and keep the low-level SSTs on the faster medium, for example by putting the SSD first in `db_paths` to store the low-level SSTs.
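
To make the level-to-path mapping concrete, here is a simplified model of how leveled compaction might assign a level to an entry of `db_paths`. It is a sketch under assumptions, not RocksDB's actual code: `DbPath` is a stand-in struct so the example compiles without RocksDB, `PathForLevel` is a hypothetical helper, and the model assumes L0 and L1 each take `level_base_bytes` with every later level growing by `multiplier`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for rocksdb::DbPath so the sketch compiles without RocksDB.
struct DbPath {
  std::string path;
  uint64_t target_size;  // bytes
};

// Simplified model (an assumption, not RocksDB's implementation):
// walk the levels from L0 upward, charge each level's target size
// against the current path's budget, and advance to the next path
// when a level no longer fits. The last path absorbs whatever is left.
std::size_t PathForLevel(const std::vector<DbPath>& paths, int level,
                         uint64_t level_base_bytes, uint64_t multiplier) {
  if (paths.empty()) return 0;  // nothing to choose from
  std::size_t p = 0;
  uint64_t budget = paths[0].target_size;
  uint64_t level_size = level_base_bytes;  // assumed size of L0 and L1
  int cur = 0;
  while (p + 1 < paths.size()) {
    if (level_size <= budget) {
      if (cur == level) return p;  // the requested level fits on path p
      budget -= level_size;
      if (cur > 0) level_size *= multiplier;  // L2 = L1 * multiplier, ...
      ++cur;
      continue;
    }
    ++p;  // level does not fit: try the next path with a fresh budget
    budget = paths[p].target_size;
  }
  return p;  // last path takes everything that did not fit earlier
}
```

Under this model, with an assumed 256 MiB level base and a multiplier of 10, the four equal 1 GB paths from the issue description place L0 and L1 on `/disk1` and every higher level on `/disk4`, because a 2.5 GiB L2 fits on none of the intermediate paths. With `/mnt/ssd` at 10 GB and `/mnt/hdd` at 1 TB, L0 through L2 land on the SSD and L3 onward on the HDD.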




[GitHub] [incubator-kvrocks] jishengming1 commented on issue #918: support Multi-disk(Multi-path)

Posted by "jishengming1 (via GitHub)" <gi...@apache.org>.
jishengming1 commented on issue #918:
URL: https://github.com/apache/incubator-kvrocks/issues/918#issuecomment-1411367234

   > > > > Hi @xiaobiaozhao, I have a question: I have two SSDs. Is there a way to split the hot data?
   > > > 
   > > > 
   > > > https://github.com/apache/incubator-kvrocks/pull/953/files#diff-e29cedc586b39d07b64be1df007101989a20f7d4452fc18fe23136f6d4ccd331R792
   > > > "/mnt/ssd 10G; /mnt/hdd 1T;" (hot data on the first path, cold data on the second)
   > > 
   > > 
   > > I am not trying to distinguish between hot and cold data. I want both disks to carry the same load, with the hot data distributed across the two disks.
   > 
   > You can try cluster mode.
   
   Thanks, is there any other way if there is only one host?
   

