Posted to general@incubator.apache.org by 吴涛 (Tao Wu) <wu...@gmail.com> on 2020/06/01 15:04:29 UTC

[Proposal] Pegasus - proposal for Apache Incubation

Dear Apache Incubator Community,

I'd like to open up a discussion about incubating Pegasus at Apache. Our proposal can be found at https://pegasus-kv.github.io/community/proposal and is also included below.

We are looking for a Champion, if anyone would like to volunteer. Thanks a lot!

Best regards
  Tao Wu

Pegasus Proposal

== Abstract ==

Pegasus is a distributed key-value storage system that is designed to be horizontally scalable, strongly consistent and high-performance.

- Pegasus codebase: https://github.com/XiaoMi/pegasus
- Website: https://pegasus-kv.github.io

== Proposal ==

Pegasus is a key-value database that delivers low-latency data access together with horizontal scalability, using hash-based partitioning. Pegasus uses the PacificA protocol for strong consistency and RocksDB as the underlying storage engine.

We propose to contribute the Pegasus codebase and associated artifacts (e.g., documentation, website content, etc.) to the Apache Software Foundation, and aim to build an open community around Pegasus’s continued development in the ‘Apache Way’.

== Background ==

Apache HBase was essentially the only large-scale KV store in use at XiaoMi until Pegasus came out in 2015. The original purpose of Pegasus was to solve the problems caused by HBase’s two-level architecture and implementation: high latency caused by Java GC and the RPC overhead of the underlying distributed filesystem, and long failover times caused by the RegionServer being a single point of failure and by the recovery overhead of splitting and replaying the HLog files.

Pegasus aims to fill the gap between Redis and HBase. The former is in-memory and low-latency, but does not provide a strong-consistency guarantee. Unlike the latter, the Pegasus server is written entirely in C++, and its read-write path relies only on the local filesystem.

Apart from the performance requirements, we also needed a storage system that ensures multi-level data safety and supports fast data migration among data centers, automatic load balancing, and online partition splitting.

After investigating many existing storage systems in the open source world, we could not find a solution that satisfied all of these requirements. So the journey of Pegasus began.

=== Rationale ===

Pegasus is a mature and active project which has been widely adopted in XiaoMi. Since the initial open-source release in 2017, we have seen a great amount of interest across a diverse set of users and companies.

Our experience as committers and PMC members on other Apache projects has convinced us that a long-term home at the Apache Software Foundation would be a great fit for the project, ensuring that processes and procedures are in place to keep the project and community ‘healthy’ and free of any commercial, political or legal faults.

=== Initial Goal ===

- Move the existing codebase, website, documentation, and mailing lists to Apache-hosted infrastructure.
- Work with the infrastructure team to implement and approve our code review, build, and testing workflows in the context of the ASF.
- Continue incremental development and releases in line with Apache guidelines.

== Current Status ==

Pegasus has been an open-source project on GitHub https://github.com/XiaoMi/pegasus since October 2017.

=== Meritocracy ===

The intent of this proposal is to start building a diverse developer and user community around Pegasus following the ASF meritocracy model. We plan to invite more people as committers if they contribute to this project.

=== Releases ===

Pegasus has undergone multiple public releases, listed here: https://github.com/XiaoMi/pegasus/releases.

These old releases were not performed in the typical ASF fashion. We will adopt the ASF source release process upon joining the incubator.

=== Code Reviews ===

Pegasus’s code reviews are currently public on Github https://github.com/XiaoMi/pegasus/pulls.

=== Community ===

Pegasus seeks to develop developer and user communities during incubation.

=== Core Developers ===

Currently, most of the core developers of Pegasus work in the KV-Storage Team at Xiaomi. Yingchun Lai is an Apache Kudu PMC member. Zuoyan Qin is an experienced open-source developer who created sofa-pbrpc at his previous job at Baidu. Wei Huang is also an active contributor to Apache Doris (Incubating).

- Zuoyan Qin (https://github.com/qinzuoyan)
- Yuchen He (https://github.com/hycdong)
- Tao Wu (https://github.com/neverchanje)
- Yingchun Lai (https://github.com/acelyc111)
- Wei Huang (https://github.com/vagetablechicken)
- Shuo Jia (https://github.com/Shuo-Jia)
- Liwei Zhao (https://github.com/levy5307)

=== Alignment ===

Pegasus is aligned with several other ASF projects.

We are working on a new feature to load data from the HDFS filesystem. Pegasus can also generate and store checkpoints to HDFS, for both backup and analysis purposes. We currently support offline analysis of checkpoints, powered by Apache Spark.

== Known Risks ==

=== Orphaned Products ===

The core developers of XiaoMi’s Pegasus team work full time on this project. There is very little risk of Pegasus being orphaned, since at least one large company (XiaoMi) uses it extensively in production, currently at a scale of 70+ clusters, 800+ tables, and more than 70TB of data. Furthermore, since Pegasus was open sourced at the beginning of October 2017, it has received more than 1200 stars, been forked more than 200 times, and received issues and pull requests from developers and users outside XiaoMi. We plan to extend and diversify this community further through Apache.

=== Inexperience with Open Source ===

The core developers are all active users and followers of open source. They are already committers and contributors to the Pegasus Github project. All have been involved with the source code that has been released under an open source license, and several of them also have experience developing code in an open source environment.

Several of the developers in XiaoMi’s storage team are committers and/or PMC members on other ASF projects (Kudu, HBase, Doris, etc.). They will guide others to practice the Apache Way together along with other incubator mentors.

=== Homogenous Developers ===

The project has received some contributions from developers outside of XiaoMi, and is starting to attract a user community as well. We hope to continue to encourage contributions from these developers and community members, and grow them into committers as they have time to continue their contributions.

=== Reliance on Salaried Developers ===

XiaoMi invests in Pegasus as a general-purpose key-value store used widely across the company. The core developers have been dedicated to this project for nearly five years.

In addition, we look forward to attracting more people outside XiaoMi to contribute to this project, whether paid engineers working in the storage area or individual volunteers, as long as they have enthusiasm for the Pegasus project.

=== An Excessive Fascination with the Apache Brand ===

Pegasus is proposing to enter incubation at Apache in order to help efforts to diversify the committer-base, not so much to capitalize on the Apache brand. The Pegasus project is in production use already inside XiaoMi, but is not expected to be a XiaoMi product for external customers. As such, the Pegasus project is not seeking to use the Apache brand as a marketing tool.

== Documentation ==

Information about Pegasus can be found at https://github.com/XiaoMi/pegasus. The following links provide more information about Pegasus in open source:

- Pegasus Website: https://pegasus-kv.github.io
- Codebase at Github: https://github.com/XiaoMi/pegasus
- Issue Tracking: https://github.com/XiaoMi/pegasus/issues
- Releases: https://pegasus-kv.github.io/releases
- Community Guide: https://pegasus-kv.github.io/community

== Initial Source ==

Besides the core codebase, Pegasus also hosts its side projects on Github under XiaoMi Group. Specifically, the initial source includes:

Client libraries with different languages:

- Java-Client: https://github.com/XiaoMi/pegasus-java-client
- Scala-Client: https://github.com/XiaoMi/pegasus-scala-client
- NodeJS-Client: https://github.com/XiaoMi/pegasus-nodejs-client
- Go-Client: https://github.com/XiaoMi/pegasus-go-client
- Python-Client: https://github.com/XiaoMi/pegasus-python-client

Components of Pegasus:

- rDSN: https://github.com/XiaoMi/rdsn
- RocksDB: https://github.com/XiaoMi/pegasus-rocksdb

rDSN was initially a distributed framework developed by Zhenyu Guo from Microsoft, and we have heavily refactored and improved it to make it better suited to Pegasus. rDSN is MIT & Apache-2.0 dual-licensed. The Apache-2.0-licensed code belongs to XiaoMi, and the copyright of the MIT-licensed code is assigned to Microsoft. We plan to merge Pegasus and rDSN into one project.

RocksDB is a Facebook-developed storage engine. Pegasus added some enhancements and modifications that may be incompatible with the original implementation. RocksDB is licensed under the Apache 2.0 License.

== External Dependencies ==

Pegasus has the following external dependencies.

- RocksDB (Apache)
- Apache Thrift (Apache Software License v2.0)
- Boost (Boost Software License)
- Apache Zookeeper (Apache)
- Google s2geometry (BSD)
- Google gflags (BSD)
- fmtlib (BSD)
- POCO (Boost Software License)
- rapidjson (Tencent)
- libevent (BSD)
- Google gperftools (BSD)
- cameron314/concurrentqueue (BSD)
- cameron314/readerwriterqueue (BSD)
- XiaoMi/galaxy-fds-sdk-cpp (No License)
- jupp0r/prometheus-cpp (MIT)
- curl (The curl license)
- nlohmann/json (MIT)
- abseil-cpp (Apache 2.0)
- antirez/linenoise (BSD-2)
- antirez/sds (BSD-2)

Build and test dependencies:

- Apache Maven (Apache Software License v2.0)
- cmake (BSD)
- Google gtest (Apache Software License v2.0)

== Required Resources ==

=== Mailing List ===

There are currently no mailing lists. The usual mailing lists are expected to be set up when entering incubation:

- private@pegasus.incubator.apache.org
- dev@pegasus.incubator.apache.org
- commits@pegasus.incubator.apache.org

=== Git Repositories ===

Upon entering incubation, we want to move the existing repository from https://github.com/XiaoMi/pegasus to Apache infrastructure like https://github.com/apache/incubator-pegasus.

=== Issue Tracking ===

Pegasus currently uses GitHub to track issues. We would like to continue to do so while we discuss migration possibilities with the ASF Infra committee.

=== Other Resources ===

The existing code already has unit tests so we will make use of existing Apache continuous testing infrastructure. The resulting load should not be very large.

== Source and Intellectual Property Submission Plan ==

Most of the current code is Apache 2.0 licensed, with copyright assigned to XiaoMi. If the project enters the Incubator, XiaoMi will transfer the source code and trademark ownership to the ASF via a Software Grant Agreement.

For historical reasons, however, Pegasus is based on MIT-licensed code originally written for microsoft/rDSN. Because the original project is unmaintained, the Pegasus team has long developed that code actively (the modified code is licensed under the Apache License 2.0). We aren’t sure whether we should ask Microsoft for a Contributor License Agreement (CLA) during the IP clearance process.

== Initial Committers ==

- Zuoyan Qin (https://github.com/qinzuoyan, qinzuoyan@xiaomi dot com)
- Weijie Sun (https://github.com/shengofsun, luckyweijie@gmail dot com)
- Yuchen He (https://github.com/hycdong, heyuchen@xiaomi dot com)
- Tao Wu (https://github.com/neverchanje, wutao1@xiaomi dot com)
- Yingchun Lai (https://github.com/acelyc111, laiyingchun@xiaomi dot com)
- Wei Huang (https://github.com/vagetablechicken, huangwei5@xiaomi dot com)
- Shuo Jia (https://github.com/Shuo-Jia, jiashuo1@xiaomi dot com)
- Liwei Zhao (https://github.com/levy5307, zhaoliwei@xiaomi dot com)
- Liuyang Cai (https://github.com/LoveHeat)

== Affiliations ==

Seven of the initial committers are employees of Xiaomi.

== Sponsors ==

=== Champion ===

TODO

=== Nominated Mentors ===

TODO

=== Sponsoring Entity ===

We are requesting the Incubator to sponsor this project.


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] PegasusProposal for Apache Incubation

Posted by Gosling Von <fe...@gmail.com>.
Hi,

Proposal has been transferred to Apache wiki[1]. We would like to hear more voices :-)



[1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal


Best Regards,
Von Gosling



Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Justin Mclean <ju...@classsoftware.com>.
Hi,

The ASF allows 3rd party code to be included in releases. It wouldn’t be part of the grant and would still retain the original headers and copyright. The ASF also doesn’t fork other communities’ code, but given that this is unmaintained and no longer in active development [1], from what I can see I don’t see any issues here.

Thanks,
Justin

1. https://github.com/microsoft/rDSN


Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Tao Wu <wu...@gmail.com>.
Sure, Sheng. We certainly don't want to steal any IP from anyone. Sorry if
I didn't make it clear.

My question is: is it legally sufficient to keep Microsoft's copyright
notice on every file that originates from them, as we have always done? As
far as I know, that suffices to prevent legal problems.

But I don't know whether the ASF has any restriction requiring that all
code, other than 3rd-party code, be owned by the ASF.

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by "张铎 (Duo Zhang)" <pa...@gmail.com>.
Since MIT is compatible with Apache 2.0, I do not think this is a blocker.
I think we can even bundle the rdsn code into the main Pegasus repository.
We can solve this problem during incubation.
Of course, it would be good if Microsoft were willing to donate the code.

On Mon, Jun 8, 2020 at 6:00 PM, Sheng Wu <wu...@gmail.com> wrote:

> On Mon, Jun 8, 2020 at 5:22 PM, 吴涛 <wu...@gmail.com> wrote:
>
> > Hi, Willem,
> >
> > This is a historical issue that we are going to resolve soon.
> >
> > We've reduced the incompatible modifications to facebook/rocksdb [1], from
> > which pegasus-rocksdb was forked, and I believe we will soon no longer need
> > to maintain this repo. It will then be used purely as an external dependency.
> >
> > As for rdsn, since the original repo (fully MIT-licensed) microsoft/rDSN [2]
> > has been unmaintained for a long time, we plan to merge the two repos
> > together. We've put a lot of effort into improving and refactoring rdsn. I'm
> > not sure if we should ask Microsoft for any CLA for the donation of our code.
> >
> >
>
> I am afraid you need to donate this, you have to. Being unmaintained doesn't
> change the fact that its IP belongs to the original team.
>
> Sheng Wu 吴晟
> Twitter, wusheng1108
>
>
> >
> > [1] https://github.com/facebook/rocksdb
> > [2] https://github.com/XiaoMi/rdsn
> >
> > On Mon, Jun 8, 2020 at 4:12 PM, Willem Jiang <wi...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I just went through the proposal and found that two of the source code
> > > repos [1][2] are forks.
> > >
> > > Are you planning to donate these two repos to Apache as part of the
> > > Pegasus project?
> > > That complicates the donation of Pegasus, as we would first need to
> > > resolve the ownership of these two repos.
> > > Normally we just contribute patches upstream, since maintaining a
> > > forked repo consumes a lot of resources.
> > >
> > > [1] https://github.com/XiaoMi/rdsn
> > > [2] https://github.com/XiaoMi/pegasus-rocksdb
> > >
> > >
> > > Willem Jiang
> > >
> > > Twitter: willemjiang
> > > Weibo: 姜宁willem
> > >
> > > On Tue, Jun 2, 2020 at 3:49 PM 吴涛 <wu...@gmail.com>
> > wrote:
> > > >
> > > > Dear Apache Incubator Community,
> > > >
> > > > I'd like to open up a discussion about incubating Pegasus at Apache.
> > Our
> > > proposal can be found at
> https://pegasus-kv.github.io/community/proposal
> > > and is also included below.
> > > >
> > > > We are looking for possible Champion if anyone would like to
> volunteer.
> > > Thanks a lot!
> > > >
> > > > Best regards
> > > >   Tao Wu
> > > >
> > > > Pegasus Proposal
> > > >
> > > > == Abstract ==
> > > >
> > > > Pegasus is a distributed key-value storage system that is designed to
> > be
> > > horizontally scalable, strongly consistent and high-performance.
> > > >
> > > > - Pegasus codebase: https://github.com/XiaoMi/pegasus
> > > > - Website: https://pegasus-kv.github.io
> > > >
> > > > == Proposal ==
> > > >
> > > > Pegasus is a key-value database that delivers low-latency data access
> > > together with horizontal scalability, using hash-based partitioning.
> > > Pegasus uses PacificA protocol for strong consistency and RocksDB as
> the
> > > underlying storage engine.
> > > >
> > > > We propose to contribute the Pegasus codebase and associated
> artifacts
> > > (e.g., documentation, website content, etc.) to the Apache Software
> > > Foundation, and aim to build an open community around Pegasus’s
> continued
> > > development in the ‘Apache Way’.
> > > >
> > > > == Background ==
> > > >
> > > > Apache HBase was recognized as mostly the only large-scale KV store
> > > solution in XiaoMi Corp until Pegasus came out in 2015. The original
> > > purpose of Pegasus was to solve the problems caused by HBase’s
> two-level
> > > architecture and implementation, including high latency because of Java
> > GC
> > > and RPC overhead of the underlying distributed filesystem, and long
> > > failover time because of single point of RegionServer and recovery
> > overhead
> > > of splitting and replaying the HLog files.
> > > >
> > > > Pegasus aims to fill the gap between Redis and HBase. As the former
> is
> > > in-memory, low latency, but does not provide a strong-consistency
> > > guarantee. And unlike the latter, Pegasus server is entirely written in
> > C++
> > > and its read-write path relies merely on the local filesystem.
> > > >
> > > > Apart from performance requirements, we also need a storage system to
> > > ensure multiple-level data safety and support fast data migration among
> > > data centers, automatic load balancing, and online partition splitting.
> > > >
> > > > After investigating lots of existing storage systems in the open
> source
> > > world, we could hardly find a suitable solution to satisfy all the
> > > requirements. So the journey of Pegasus begins.
> > > >
> > > > === Rationale ===
> > > >
> > > > Pegasus is a mature and active project which has been widely adopted
> in
> > > XiaoMi. After the initial release of open source project in 2017, we
> have
> > > seen a great amount of interest across a diverse set of users and
> > companies.
> > > >
> > > > Our experiences at committers and PMC members on other Apache
> projects
> > > have convinced us that having a long-term home at Apache foundation
> would
> > > be a great fit for the project, to ensure that processes and procedures
> > are
> > > in place to keep project and community ‘healthy’ and free of any
> > > commercial, political or legal faults.
> > > >
> > > > === Initial Goal ===
> > > >
> > > > Move the existing codebase, website, documentation, and mailing lists
> > to
> > > Apache-hosted infrastructure.
> > > > Work with the infrastructure team to implement and approve our code
> > > review, build, and testing workflows in the context of the ASF.
> > > > Incremental development and releases along with Apache guidelines.
> > > >
> > > > == Current Status ==
> > > >
> > > > Pegasus has been an open-source project on GitHub
> > > https://github.com/XiaoMi/pegasus since October 2017.
> > > >
> > > > === Meritocracy ===
> > > >
> > > > The intent of this proposal is to start building a diverse developer
> > and
> > > user community around Pegasus following the ASF meritocracy model. We
> > plan
> > > to invite more people as committers if they contribute to this project.
> > > >
> > > > === Releases ===
> > > >
> > > > Pegasus has undergone multiple public releases, listed here:
> > > https://github.com/XiaoMi/pegasus/releases.
> > > >
> > > > These old releases were not performed in the typical ASF fashion. We
> > > will adopt the ASF source release process upon joining the incubator.
> > > >
> > > > === Code Reviews ===
> > > >
> > > > Pegasus’s code reviews are currently public on Github
> > > https://github.com/XiaoMi/pegasus/pulls.
> > > >
> > > > === Community ===
> > > >
> > > > Pegasus seeks to develop developer and user communities during
> > > incubation.
> > > >
> > > > === Core Developers ===
> > > >
> > > > Currently most of the core developers of Pegasus are working in the
> > > KV-Storage Team of Xiaomi. Yingchun Lai is one of the Apache Kudu PMC
> > > members. Zuoyan Qin is an experienced open-source developer who created
> > > sofa-pbrpc in his last job in Baidu. Wei Huang is also an active
> > > contributor of Apache Doris (Incubating).
> > > >
> > > > - Zuoyan Qin (https://github.com/qinzuoyan)
> > > > - Yuchen He (https://github.com/hycdong)
> > > > - Tao Wu (https://github.com/neverchanje)
> > > > - Yingchun Lai (https://github.com/acelyc111)
> > > > - Wei Huang (https://github.com/vagetablechicken)
> > > > - Shuo Jia (https://github.com/Shuo-Jia)
> > > > - Liwei Zhao (https://github.com/levy5307)
> > > >
> > > > === Alignment ===
> > > >
> > > > Pegasus is aligned with several other ASF projects.
> > > >
> > > > We are working on a new feature to load data from the HDFS
> filesystem.
> > > Pegasus can also generate and store checkpoints to HDFS, for both
> backup
> > > and analysis purpose. We currently support offline analysis on
> > checkpoints
> > > powered by Apache Spark.
> > > >
> > > > == Known Risks ==
> > > >
> > > > === Orphaned Products ===
> > > >
> > > > The core developers of XiaoMi’s Pegasus team work full time on this
> > > project. There is very little risk of Pegasus getting orphaned since at
> > > least one large company (XiaoMi) is extensively using it in production,
> > > with currently a scale of 70+ clusters, 800+ tables, and more than 70TB
> > > data. Furthermore, since Pegasus was open sourced at the beginning of
> > > October 2017, it has received more than 1200 stars and been forked more
> > > than 200 times, and also received some issues and pull requests from
> > > developers and users outside XiaoMi. We plan to extend and diversify
> this
> > > community further through Apache.
> > > >
> > > > === Inexperience with Open Source ===
> > > >
> > > > The core developers are all active users and followers of open
> source.
> > > They are already committers and contributors to the Pegasus Github
> > project.
> > > All have been involved with the source code that has been released
> under
> > an
> > > open source license, and several of them also have experience
> developing
> > > code in an open source environment.
> > > >
> > > > Several of the developers in XiaoMi’s storage team are committers
> > and/or
> > > PMC members on other ASF projects (Kudu, HBase, Doris, etc.). They will
> > > guide others to practice the Apache Way together along with other
> > incubator
> > > mentors.
> > > >
> > > > === Homogenous Developers ===
> > > >
> > > > The project has received some contributions from developers outside
> of
> > > XiaoMi, and is starting to attract a user community as well. We hope to
> > > continue to encourage contributions from these developers and community
> > > members, and grow them into committers as they have time to continue
> > their
> > > contributions.
> > > >
> > > > === Reliance on Salaried Developers ===
> > > >
> > > > XiaoMi invested in Pegasus as a general key-value storage used in
> > > company widely. The core developers have been dedicated to this project
> > for
> > > nearly five years.
> > > >
> > > > Besides, we look forward to attracting more people outside XiaoMi to
> > > contribute to this project, either payed engineers working on storage
> > area,
> > > or individual volunteers, as long as they have enthusiasm for the
> Pegasus
> > > project.
> > > >
> > > > === An Excessive Fascination with the Apache Brand ===
> > > >
> > > > Pegasus is proposing to enter incubation at Apache in order to help
> > > efforts to diversify the committer-base, not so much to capitalize on
> the
> > > Apache brand. The Pegasus project is in production use already inside
> > > XiaoMi, but is not expected to be a XiaoMi product for external
> > customers.
> > > As such, the Pegasus project is not seeking to use the Apache brand as
> a
> > > marketing tool.
> > > >
> > > > == Documentation ==
> > > >
> > > > Information about Pegasus can be found at
> > > https://github.com/XiaoMi/pegasus. The following links provide more
> > > information about Pegasus in open source:
> > > >
> > > > - Pegasus Website: https://pegasus-kv.github.io
> > > > - Codebase at Github: https://github.com/XiaoMi/pegasus
> > > > - Issue Tracking: https://github.com/XiaoMi/pegasus/issues
> > > > - Releases: https://pegasus-kv.github.io/releases
> > > > - Community Guide: https://pegasus-kv.github.io/community
> > > >
> > > > == Initial Source ==
> > > >
> > > > Besides the core codebase, Pegasus also hosts its side projects on
> > > Github under XiaoMi Group. Specifically, the initial source includes:
> > > >
> > > > Client libraries with different languages:
> > > >
> > > > - Java-Client: https://github.com/XiaoMi/pegasus-java-client
> > > > - Scala-Client: https://github.com/XiaoMi/pegasus-scala-client
> > > > - NodeJS-Client: https://github.com/XiaoMi/pegasus-nodejs-client
> > > > - Go-Client: https://github.com/XiaoMi/pegasus-go-client
> > > > - Python-Client: https://github.com/XiaoMi/pegasus-python-client
> > > >
> > > > Components of Pegasus:
> > > >
> > > > - rDSN: https://github.com/XiaoMi/rdsn
> > > > - RocksDB: https://github.com/XiaoMi/pegasus-rocksdb
> > > >
> > > > rDSN was initially a distributed framework developed by Zhenyu Guo
> from
> > > Microsoft, and we have heavily refactored and improved it to make it
> more
> > > fit for Pegasus. rDSN is MIT & Apache-2.0 dual-licensed. The code
> > licensed
> > > Apache-2.0 belongs to XiaoMi and the copyright of MIT-licensed code is
> > > assigned to Microsoft. It’s in our plan to merge Pegasus and rDSN as
> one
> > > project.
> > > >
> > > > RocksDB is a Facebook-developed storage engine. Pegasus added some
> > > enhancements and modifications that may be incompatible with the
> original
> > > implementation. RocksDB is licensed under Apache 2.0 License.
> > > >
> > > > == External Dependencies ==
> > > >
> > > > Pegasus has the following external dependencies.
> > > >
> > > > - RocksDB (Apache)
> > > > - Apache Thrift (Apache Software License v2.0)
> > > > - Boost (Boost Software License)
> > > > - Apache Zookeeper (Apache)
> > > > - Google s2geometry (BSD)
> > > > - Google gflags (BSD)
> > > > - fmtlib (BSD)
> > > > - POCO (Boost Software License)
> > > > - rapidjson (Tencent)
> > > > - libevent (BSD)
> > > > - Google gperftools (BSD)
> > > > - cameron314/concurrentqueue (BSD)
> > > > - cameron314/readerwriterqueue (BSD)
> > > > - XiaoMi/galaxy-fds-sdk-cpp (No License)
> > > > - jupp0r/prometheus-cpp (MIT)
> > > > - curl (The curl license)
> > > > - nlohmann/json (MIT)
> > > > - abseil-cpp (Apache 2.0)
> > > > - antirez/linenoise (BSD-2)
> > > > - antirez/sds (BSD-2)
> > > >
> > > > Build and test dependencies:
> > > >
> > > > - Apache Maven (Apache Software License v2.0)
> > > > - cmake (BSD)
> > > > - Google gtest (Apache Software License v2.0)
> > > >
> > > > == Required Resources ==
> > > >
> > > > === Mailing List ===
> > > >
> > > > There are currently no mailing lists. The usual mailing lists are
> > > expected to be set up when entering incubation:
> > > >
> > > > - private@pegasus.incubator.apache.org
> > > > - dev@pegasus.incubator.apache.org
> > > > - commits@pegasus.incubator.apache.org
> > > >
> > > > === Git Repositories ===
> > > >
> > > > Upon entering incubation, we want to move the existing repository
> from
> > > https://github.com/XiaoMi/pegasus to Apache infrastructure like
> > > https://github.com/apache/incubator-pegasus.
> > > >
> > > > === Issue Tracking ===
> > > >
> > > > Pegasus currently uses Github to track issues. Would like to continue
> > to
> > > do so while we discuss migration possibilities with the ASF Infra
> > committee.
> > > >
> > > > === Other Resources ====
> > > >
> > > > The existing code already has unit tests so we will make use of
> > existing
> > > Apache continuous testing infrastructure. The resulting load should not
> > be
> > > very large.
> > > >
> > > > == Source and Intellectual Property Submission Plan ==
> > > >
> > > > Most of the current code is Apache 2.0 licensed and the copyright is
> > > assigned to XiaoMi. If the project enters incubator, XiaoMi will
> transfer
> > > the source code & trademark ownership to ASF via a Software Grant
> > Agreement.
> > > >
> > > > But due to historical issues, Pegasus was based on an MIT licensed
> code
> > > that was initially written by microsoft/rDSN, which has long been
> > actively
> > > developed by Pegasus because the original project is unmaintained
> > (modified
> > > code is licensed under Apache License 2.0). We aren’t sure if we should
> > > request Microsoft for any Contributor License Agreement (CLA) during
> the
> > IP
> > > clearance process.
> > > >
> > > > == Initial Committers ==
> > > >
> > > > - Zuoyan Qin (https://github.com/qinzuoyan, qinzuoyan@xiaomi dot
> com)
> > > > - Weijie Sun (https://github.com/shengofsun, luckyweijie@gmail dot
> > com)
> > > > - Yuchen He (https://github.com/hycdong, heyuchen@xiaomi dot com)
> > > > - Tao Wu (https://github.com/neverchanje, wutao1@xiaomi dot com)
> > > > - Yingchun Lai (https://github.com/acelyc111, laiyingchun@xiaomi dot
> > > com)
> > > > - Wei Huang (https://github.com/vagetablechicken, huangwei5@xiaomi
> dot
> > > com)
> > > > - Shuo Jia (https://github.com/Shuo-Jia, jiashuo1@xiaomi dot com)
> > > > - Liwei Zhao (https://github.com/levy5307, zhaoliwei@xiaomi dot com)
> > > > - Liuyang Cai (https://github.com/LoveHeat)
> > > >
> > > > == Affiliations ==
> > > >
> > > > Seven of the initial committers are employees of Xiaomi.
> > > >
> > > > == Sponsors ==
> > > >
> > > > === Champion ===
> > > >
> > > > TODO
> > > >
> > > > === Nominated Mentors ===
> > > >
> > > > TODO
> > > >
> > > > === Sponsoring Entity ===
> > > >
> > > > We are requesting the Incubator to sponsor this project.
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > > For additional commands, e-mail: general-help@incubator.apache.org
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > For additional commands, e-mail: general-help@incubator.apache.org
> > >
> > >
> >
>

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Sheng Wu <wu...@gmail.com>.
On Mon, Jun 8, 2020 at 5:22 PM 吴涛 <wu...@gmail.com> wrote:

> Hi, Willem,
>
> This is a historical issue that we are going to resolve soon.
>
> We've reduced the incompatible modifications to facebook/rocksdb [1], from
> which pegasus-rocksdb was forked, and I believe we will soon no longer need
> to maintain this repo. RocksDB will then be used purely as an external
> dependency.
>
> As for rdsn, since the original repo microsoft/rDSN [2] (fully MIT-licensed)
> has been unmaintained for a long time, we plan to merge the two repos
> together. We've put a lot of effort into improving and refactoring rdsn. I'm
> not sure if we should ask Microsoft for any CLA for the donation of our code.
>

I am afraid you do have to get this donated; there is no way around it. Being
unmaintained doesn't change the fact that its IP belongs to the original team.

Sheng Wu 吴晟
Twitter, wusheng1108


>
> [1] https://github.com/facebook/rocksdb
> [2] https://github.com/XiaoMi/rdsn

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by 吴涛 <wu...@gmail.com>.
Hi, Willem,

This is a historical issue that we are going to resolve soon.

We've reduced the incompatible modifications to facebook/rocksdb [1], from
which pegasus-rocksdb was forked, and I believe we will soon no longer need
to maintain this repo. RocksDB will then be used purely as an external
dependency.

As for rdsn, since the original repo microsoft/rDSN [2] (fully MIT-licensed)
has been unmaintained for a long time, we plan to merge the two repos
together. We've put a lot of effort into improving and refactoring rdsn. I'm
not sure if we should ask Microsoft for any CLA for the donation of our code.

[1] https://github.com/facebook/rocksdb
[2] https://github.com/XiaoMi/rdsn

On Mon, Jun 8, 2020 at 4:12 PM Willem Jiang <wi...@gmail.com> wrote:

> Hi,
>
> I just went through the proposal and found that two of the source code
> repos [1][2] are forks.
>
> Are you planning to donate these two repos to Apache as part of the
> Pegasus project?
> That complicates the donation of Pegasus, as we need to settle the
> ownership of these two repos first.
> Normally we just contribute patches upstream, as maintaining a forked
> repo consumes a lot of resources.
>
> [1] https://github.com/XiaoMi/rdsn
> [2] https://github.com/XiaoMi/pegasus-rocksdb
>
>
> Willem Jiang
>
> Twitter: willemjiang
> Weibo: 姜宁willem

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Willem Jiang <wi...@gmail.com>.
Hi,

I just went through the proposal and found that two of the source code
repos [1][2] are forks.

Are you planning to donate these two repos to Apache as part of the
Pegasus project?
That complicates the donation of Pegasus, as we need to settle the
ownership of these two repos first.
Normally we just contribute patches upstream, as maintaining a forked
repo consumes a lot of resources.

[1] https://github.com/XiaoMi/rdsn
[2] https://github.com/XiaoMi/pegasus-rocksdb


Willem Jiang

Twitter: willemjiang
Weibo: 姜宁willem



Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Tao Wu <wu...@gmail.com>.
Welcome, Liang! Thank you for offering to help. Now we have enough mentors.

On Wed, Jun 10, 2020 at 10:46 PM Liang Chen <ch...@gmail.com> wrote:

> Hi Tao Wu
>
> I am willing to be a mentor. I am familiar with big data and can give some
> help with incubation.
>
>
> Regards
> Liang
>
> Justin Mclean wrote
> > Hi,
> >
> >> Thank you for your interest, Kevin. Since we now have 2 mentors, shall we
> >> call for a vote?
> >
> > Given Kevin said he is only 1/2 a mentor, it would be preferable if you had one
> > more mentor. Anyone willing to help out?
> >
> > Thanks,
> > Justin

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Liang Chen <ch...@gmail.com>.
Hi Tao Wu

I am willing to be a mentor. I am familiar with big data and can give some
help with incubation.


Regards
Liang

Justin Mclean wrote
> Hi,
> 
>> Thank you for your interest, Kevin. Since we now have 2 mentors, shall we
>> call for a vote?
>
> Given Kevin said he is only 1/2 a mentor, it would be preferable if you had one
> more mentor. Anyone willing to help out?
> 
> Thanks,
> Justin


Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Justin Mclean <ju...@classsoftware.com>.
Hi,

> Thank you for your interest, Kevin. Since we now have 2 mentors, shall we
> call for a vote?

Given Kevin said he is only 1/2 a mentor, it would be preferable if you had one more mentor. Anyone willing to help out?

Thanks,
Justin


Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Tao Wu <wu...@gmail.com>.
Thank you for your interest, Kevin. Since we now have 2 mentors, shall we
call for a vote?
@Gosling Von <fe...@gmail.com>

On Wed, Jun 10, 2020 at 2:41 AM Kevin A. McGrail <km...@apache.org> wrote:

> While I am honored to be asked, I am actively mentoring a number of
> projects right now, with another one being proposed.
> Please count me as a mentor but seek out others, and consider me in reality
> to be 1/2 a mentor :-)
>
> --
> Kevin A. McGrail
> Member, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>
>
> On Mon, Jun 8, 2020 at 4:13 AM 吴涛 <wu...@gmail.com> wrote:
>
> > Thanks, Kevin. I'd very much appreciate it if you can become one of our
> > mentors.
> >
> > Regards
> >
> > Tao.

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by "Kevin A. McGrail" <km...@apache.org>.
While I am honored to be asked, I am actively mentoring a number of
projects right now, with another one being proposed.
Please count me as a mentor but seek out others, and consider me in reality
to be 1/2 a mentor :-)

--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Mon, Jun 8, 2020 at 4:13 AM 吴涛 <wu...@gmail.com> wrote:

> Thanks, Kevin. I'd very much appreciate it if you can become one of our
> mentors.
>
> Regards
>
> Tao.

Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by Tao Wu <wu...@gmail.com>.
Thanks, Kevin. I'd very much appreciate it if you can become one of our mentors.

Regards

Tao.



Re: [Proposal] Pegasus - proposal for Apache Incubation

Posted by "Kevin A. McGrail" <km...@apache.org>.
I've used Redis for a long time, and Pegasus looks very interesting.

I'd like to see a champion and some mentors but otherwise I really like
what I see here.

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

