Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2020/08/21 11:05:01 UTC

[GitHub] [incubator-pegasus] Shuo-Jia opened a new issue #582: support new “data export” feature to replace using backup

Shuo-Jia opened a new issue #582:
URL: https://github.com/apache/incubator-pegasus/issues/582


   ## Feature Request
   
   **Is your feature request related to a problem? Please describe:**
   <!-- A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] -->
   Currently, the only way to dump/export data quickly is the [backup](http://pegasus.apache.org/administration/cold-backup) feature. However, backup is not well suited to dumping data. For example:
   * It can't dump immediately: we have to create a task and wait for it to execute.
   * The output path contains redundant sub-directories such as `policy name`, and a custom path is not supported.
   
   These and some other problems make dumping data complicated, especially when using [Pegasus-Spark](https://github.com/pegasus-kv/pegasus-spark) to read the dumped data.
   
   **Describe the feature you'd like:**
   <!-- A clear and concise description of what you want to happen. -->
   I expect the command line to be as simple as the following:
   ```shell
   # dump/export once
   # target is hdfs, sub-path is optional, default can be pegasus/cluster_name/table_name
   >> dump/export table_name hdfs://url sub-path
   # target is fds, sub-path is optional, default can be pegasus/cluster_name/table_name
   >> dump/export table_name endpoint bucket sub-path
   
   # dump/export periodically
   >> dump/export table_name hdfs://url sub-path start_time periodic_time
   ```
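
The default sub-path rule mentioned in the comments above could be resolved roughly as follows. This is only a minimal Python sketch, not Pegasus code: `resolve_sub_path` and its parameter names are hypothetical, and stripping surrounding slashes from a user-supplied sub-path is an assumption rather than something this proposal specifies.

```python
from typing import Optional


def resolve_sub_path(cluster_name: str, table_name: str,
                     sub_path: Optional[str] = None) -> str:
    """Return the export sub-path (hypothetical helper).

    Falls back to the default proposed in the issue,
    pegasus/cluster_name/table_name, when no sub-path is given.
    """
    # Assumption: surrounding slashes on a custom sub-path are not significant.
    cleaned = (sub_path or "").strip("/")
    if cleaned:
        return cleaned
    return f"pegasus/{cluster_name}/{table_name}"
```

For example, `resolve_sub_path("my_cluster", "my_table")` yields the default `pegasus/my_cluster/my_table`, while passing a third argument overrides it.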
   Finally, the data path is:
   ```
   root/pegasus/cluster/table/timestamp/partition/file.sst
   ```
   Users can then read the data directly with [Pegasus-Spark](https://github.com/pegasus-kv/pegasus-spark), without having to supply the `policy name`, `cluster_name`, `table_name`, and `fds/hdfs config`.
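
To illustrate why the proposed layout is enough for a reader to work from the path alone, here is a minimal Python sketch that builds and parses `root/pegasus/cluster/table/timestamp/partition/file.sst`. The names `ExportedFile`, `build_data_path`, and `parse_data_path` are hypothetical and are not part of Pegasus or Pegasus-Spark. The timestamp and partition are treated as opaque strings, since the issue does not specify their format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportedFile:
    """Components of one exported file path (hypothetical type)."""
    root: str
    cluster: str
    table: str
    timestamp: str   # format not specified in the issue; kept opaque
    partition: str   # format not specified in the issue; kept opaque
    file_name: str


def build_data_path(f: ExportedFile) -> str:
    """Join the components in the order proposed in the issue."""
    return "/".join([f.root.rstrip("/"), "pegasus", f.cluster, f.table,
                     f.timestamp, f.partition, f.file_name])


def parse_data_path(path: str) -> ExportedFile:
    """Recover cluster, table, etc. from a path laid out as proposed."""
    parts = path.rstrip("/").split("/")
    # The last five components follow the fixed "pegasus" directory;
    # everything before it is the root.
    if len(parts) < 7 or parts[-6] != "pegasus":
        raise ValueError(f"not a recognized export path: {path!r}")
    cluster, table, timestamp, partition, file_name = parts[-5:]
    root = "/".join(parts[:-6])
    return ExportedFile(root, cluster, table, timestamp, partition, file_name)
```

Under these assumptions a reader needs only the path: the cluster and table names fall out of its components, so they do not have to be passed separately.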
   
   **Describe alternatives you've considered:**
   <!-- A clear and concise description of any alternative solutions or features you've considered. -->
   Since the `cold backup` code is complex, we don't need to refactor it to achieve the above result; we can implement this as a new feature instead.
   
   **Teachability, Documentation, Adoption, Migration Strategy:**
   <!-- If you can, explain some scenarios how users might use this, situations it would be helpful in. Any API designs, mockups, or diagrams are also helpful. -->
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


