Posted to issues@kvrocks.apache.org by "git-hulk (via GitHub)" <gi...@apache.org> on 2023/03/08 14:09:01 UTC

[GitHub] [incubator-kvrocks] git-hulk opened a new issue, #1301: Add support of bulk load for the string like HBase bulkload

git-hulk opened a new issue, #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-kvrocks/issues) and found no similar issues.
   
   
   ### Motivation
   
   Many scenarios need to bulk-load large volumes of data regularly, and loading it through the regular command API can cause heavy load and latency spikes. It would be better if we could offer a way to mitigate this.
   
   ### Solution
   
   We can use [RocksDB Ingest SST](https://github.com/EighteenZi/rocksdb_wiki/blob/master/Creating-and-Ingesting-SST-files.md) to bulk load the data, supporting simple strings only at first.
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@kvrocks.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1596391840

   > @git-hulk @jihuayu Hi, here is the bulk load ingestion implementation of pegasus. https://github.com/apache/incubator-pegasus/pulls?q=label%3Acomponent%2Fbulk_load+. FYI
   
   Cool, thanks for your input.




Re: [I] Add support of bulk load for the string like HBase bulkload [kvrocks]

Posted by "JackyYangPassion (via GitHub)" <gi...@apache.org>.
JackyYangPassion commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-2050802195

   
   
   
   > @JackyYangPassion No. Do you want to have a try?
   
   OK,
   I've been researching how to generate SST files recently.
   
   I read through the discussion in https://github.com/apache/kvrocks/discussions/1628 carefully.
   
   Does this PR only support the String type?




[GitHub] [incubator-kvrocks] jihuayu commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "jihuayu (via GitHub)" <gi...@apache.org>.
jihuayu commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1595511383

   @git-hulk  Ok, I'm willing to submit a PR!




[GitHub] [incubator-kvrocks] liucyao1990 commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "liucyao1990 (via GitHub)" <gi...@apache.org>.
liucyao1990 commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1596390077

   @git-hulk @jihuayu Hi, here is the bulk load ingestion implementation of pegasus. https://github.com/apache/incubator-pegasus/pulls?q=label%3Acomponent%2Fbulk_load+. FYI




[GitHub] [incubator-kvrocks] ColinChamber commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "ColinChamber (via GitHub)" <gi...@apache.org>.
ColinChamber commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1493590029

    I'm willing to submit a PR! 




Re: [I] Add support of bulk load for the string like HBase bulkload [kvrocks]

Posted by "jihuayu (via GitHub)" <gi...@apache.org>.
jihuayu commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-2050849332

   @JackyYangPassion Thank you!
   Supporting strings is the first step in our plan. We want to start with a basic version that users can try, so that we can gather feedback on the functionality as early as possible.
   In later stages, we will support more types and functionality.




[GitHub] [incubator-kvrocks] zuston commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "zuston (via GitHub)" <gi...@apache.org>.
zuston commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1461214893

   Thanks for proposing this. +1 for this feature.




[GitHub] [kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-1605260131

   Yes, that's right. It's good to NOT support the replication for now.




[GitHub] [incubator-kvrocks] liucyao1990 commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "liucyao1990 (via GitHub)" <gi...@apache.org>.
liucyao1990 commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1582252141

   @git-hulk @ColinChamber Thanks for this PR. Is there any progress? Looking forward to this bulk load feature.




[GitHub] [incubator-kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1593346932

   > In my opinion, there are two steps here.
   > 1. Create SST files with the data.
   > 2. Ingest the SST files.
   
   @jihuayu Yes, you're right. And I think it's good to only support the string type first. 
   
   > Do we need to support online bulk load? Will there be problems with stopping the world?
   
   My intuition is that we should support online bulk load, even though ingesting SSTs will block write operations.
   
   > For this feature, do we need to provide a command to load data, or a tool?
   
   For my part, I would like to support loading local SSTs via a command and also provide a tool to generate the SST files. For the tool's input file, we can require users to put their data in a specified format such as CSV.
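
A minimal sketch of the input side of such a generation tool, assuming a two-column `key,value` CSV (the thread does not settle on a format). RocksDB's `SstFileWriter` expects keys to be added in ascending order, so the sketch deduplicates the rows and sorts them by raw key bytes. The function name `load_sorted_pairs` and the last-row-wins rule are illustrative assumptions, and the sketch does not apply whatever internal key and value encoding Kvrocks uses, which a real tool would need.

```python
import csv
import io


def load_sorted_pairs(csv_text):
    """Parse `key,value` rows and return unique pairs sorted by key bytes."""
    latest = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        key, value = row
        # Assumption: later rows win, so a repeated key keeps its last value.
        latest[key.encode("utf-8")] = value.encode("utf-8")
    # Sort by raw bytes, the order a bytewise comparator would use.
    return sorted(latest.items())


sample = "user:2,bob\nuser:1,alice\nuser:2,carol\n"
pairs = load_sorted_pairs(sample)
```

Sorting up front keeps the later file-writing step a simple sequential pass over `pairs`.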
   




[GitHub] [kvrocks] jihuayu commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "jihuayu (via GitHub)" <gi...@apache.org>.
jihuayu commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-1605246361

   I will first create the SST generation tool.
   We have `cluster` and `replication` modes, and `Ingest SST` may work differently in each. I think I can first support ingestion in standalone mode.
   
   
   
   




[GitHub] [incubator-kvrocks] jihuayu commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "jihuayu (via GitHub)" <gi...@apache.org>.
jihuayu commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1593126518

   @git-hulk For this feature, do we need to provide a command to load data, or a tool?
   
   In my opinion, there are two steps here. 
   1. Create SST files with the data.
   2. Ingest the SST files. 
   
   The second step requires stopping the world.
   
   Do we need to support online bulk load? Will there be problems with stopping the world?
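
The planning half of the first step can be sketched without RocksDB itself. Assuming the input is already sorted with unique keys, cutting it into consecutive chunks gives each planned SST file its own disjoint key range. The names `split_into_files`, `key_ranges`, and `max_entries` are hypothetical; actually writing and ingesting the files would go through RocksDB's `SstFileWriter` and `DB::IngestExternalFile`, which this sketch does not call.

```python
def split_into_files(sorted_pairs, max_entries):
    """Split sorted pairs into consecutive chunks, one per planned SST file."""
    if max_entries <= 0:
        raise ValueError("max_entries must be positive")
    return [
        sorted_pairs[i:i + max_entries]
        for i in range(0, len(sorted_pairs), max_entries)
    ]


def key_ranges(chunks):
    """Return the (smallest_key, largest_key) range of each chunk."""
    return [(chunk[0][0], chunk[-1][0]) for chunk in chunks]


pairs = [(b"a", b"1"), (b"b", b"2"), (b"c", b"3"), (b"d", b"4"), (b"e", b"5")]
chunks = split_into_files(pairs, max_entries=2)
ranges = key_ranges(chunks)
```

Here `ranges` holds one `(smallest_key, largest_key)` pair per planned file, which is handy for logging or for checking the plan before any file is written.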




[GitHub] [incubator-kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1493601572

   @ColinChamber Assigned.




Re: [I] Add support of bulk load for the string like HBase bulkload [kvrocks]

Posted by "jihuayu (via GitHub)" <gi...@apache.org>.
jihuayu commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-2046239291

   @JackyYangPassion No. Do you want to have a try?




Re: [I] Add support of bulk load for the string like HBase bulkload [kvrocks]

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-2050831898

   @JackyYangPassion Yes, we would like to support strings first since it's the simplest type. It would definitely be great if other data types can be covered as well.




[GitHub] [incubator-kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1584006965

   Thanks @ColinChamber for your update.




[GitHub] [incubator-kvrocks] ColinChamber commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "ColinChamber (via GitHub)" <gi...@apache.org>.
ColinChamber commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1584006300

   I haven't had enough time recently, and I look forward to someone else picking this up. I've unassigned myself. @liucyao1990




[GitHub] [incubator-kvrocks] git-hulk commented on issue #1301: Add support of bulk load for the string like HBase bulkload

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1301:
URL: https://github.com/apache/incubator-kvrocks/issues/1301#issuecomment-1595709913

   Thanks @jihuayu, assigned.
   
   @zuston @liucyao1990 You're also welcome to provide more input on how you would use bulk load.




Re: [I] Add support of bulk load for the string like HBase bulkload [kvrocks]

Posted by "JackyYangPassion (via GitHub)" <gi...@apache.org>.
JackyYangPassion commented on issue #1301:
URL: https://github.com/apache/kvrocks/issues/1301#issuecomment-2044292540

   Are there any updates here?

