Posted to dev@couchdb.apache.org by "Reddy B." <re...@live.fr> on 2019/07/09 23:07:40 UTC

CouchDb Rewrite/Fork

Hi all,

I've checked the recent discussions and apparently July is the "vision month" lol. Hopefully this email will not saturate the patience of the core team.

We have been thinking about forking/rewriting CouchDb internally for quite some time now, and this idea has reached a degree of maturity such that I'm pretty confident it will materialize at this point. We hesitated between doing our thing internally to then make our big open-sourcing announcement 5-10 years from now when the product is battle tested, and announcing our intentions here today.

However, I realized that good things may happen by providing this feedback, and that providing this type of feedback also is a way of giving back to the community.

The reason for this project is that we have lost confidence in the way the vision of CouchDb aligns with our goals. As far as we are concerned, there are 3 things we loved with CouchDb:

#Map/Reduce

We think that the benefits of Map/Reduce are very underrated. Map/reduce forces developers to approach problems differently and results in much more efficient and well-thought-out application architectures and implementations. This is in addition to the performance benefits since indexes are built in advance in a very predictable manner (with a few well-documented caveats). For this reason, our developers are forbidden from using Mango, and we require them to wrap their heads around problems until they are able to solve them in map/reduce mode.
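
To make the "indexes are built in advance" point concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not CouchDB's implementation: real views are usually written in JavaScript and stored in design documents, and the names DOCS, map_doc, build_index and reduce_sum are made up for this example. The reduce step is similar in spirit to the built-in `_sum` reduce.

```python
from collections import defaultdict

# Hypothetical documents; in CouchDB these would be JSON docs in a database.
DOCS = [
    {"_id": "a", "type": "order", "customer": "alice", "total": 30},
    {"_id": "b", "type": "order", "customer": "bob", "total": 12},
    {"_id": "c", "type": "order", "customer": "alice", "total": 8},
    {"_id": "d", "type": "note", "text": "not an order"},
]


def map_doc(doc):
    """Map step: emit (key, value) pairs only for the docs we care about."""
    if doc.get("type") == "order":
        yield doc["customer"], doc["total"]


def build_index(docs):
    """Run the map step once over all docs and keep the rows sorted by key."""
    rows = [(key, value) for doc in docs for key, value in map_doc(doc)]
    return sorted(rows)


def reduce_sum(rows):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


INDEX = build_index(DOCS)    # built ahead of time, not once per query
TOTALS = reduce_sum(INDEX)   # grouped result computed from the index rows
```

The design constraint is visible even in this toy version: the question being asked ("total per customer") has to be decided when the map function is written, which is exactly the discipline described above.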

However, we can see that the focus of the CouchDb project is increasingly on Mango, and we have little confidence in the commitment of the project to first-class citizen Map/Reduce support (while this was for us a defining aspect of the identity of CouchDb).

#Complexity of the codebase

An open-source software that is too complex to be tweaked and hacked is for all practical purposes closed-source software. You guys are VERY smart. And by nature a database software system is a non-trivial piece of technology.

Initially we felt confident that the codebase was small enough and clean enough that should we really need to get our hands dirty in an emergency situation, we would be able to do so. Then Mango made the situation a bit blurrier, but we could easily ignore that, especially since we do not use it. However with FoundationDB... this becomes a whole different story.

The domain model of a database is non-trivial by nature, and now FoundationDb will introduce an additional level of abstraction and indirection, and a very serious one. I've been reading the design discussions since the FoundationDb announcement and there are a lot of impedance mismatches requiring the domain model of CouchDb to be broken up into fictitious entities intended to accommodate FoundationDb abstractions and their limitations (I'll come back to this point in a moment).

Indirection is also introduced at the business logic level, with additional steps needing to be followed to emulate the desired behavior. All of this is complexity and obfuscation, and to be realistic, if we already struggled with the straight-to-the-point implementation, there is no way we'll be able to navigate (let alone hack), the FoundationDB-based implementation.

#(Apparent) Non-Alignment of FoundationDb with the reasons that made us love CouchDb

FoundationDb introduces limitations regarding transactions, document sizes and a number of other critical items. One of the main reasons we use CouchDb is because of the way it allows us to develop applications rapidly and flexibly address all the state storage needs of application layers. CouchDb has you covered if you just want to dump large media files streamed with HTTP range requests while you iterate fast and your userbase is small, and replication allows you to seamlessly scale by distributing load on clusters in advanced ways without needing to redesign your applications. The user nkosi23 nicely describes some of the new possibilities enabled by CouchDb:

https://github.com/apache/couchdb/pull/1253#issuecomment-507043600

However, the limitations introduced by FoundationDb and the spirit of their project favoring abstraction purity through aggressive constraints, over operational flexibility is the opposite of the reasons we loved CouchDb and believed in it. It is to us pretty clear that the writing is on the wall. We aren't confident in FoundationDb to cover our bases, since covering our bases is explicitly not the goal of their project and their spirit is different from what has made CouchDb unique (ease of use, simple yet powerful and flexible abstractions etc...).

#Lack of commitment to the ideas pioneered

We feel like CouchDb itself undervalues the wealth of what it has brought to the table. For example when it comes to architecting load balancing for all sorts of applications with a single and transparent value store, CouchDb enables things that simply weren't possible before, and people will need time to understand how they can take advantage of them.

Nowadays we can see sed, awk and such be used in pretty clever ways, but it took time for people to incorporate the possibilities enabled by these tools in their thinking process (even though system administration tools are much easier to deploy than enterprise applications).

I think that CouchDb should have a 10 or 20-year outlook on the paradigm shifts it introduces. There is a need to give more place to faith and less place to data since not every usage will be adopted within 3 years. Sometimes you need to do things because you believe in them and you know you are right and that eventually people will come. But right now, it feels like customer statistics from Cloudant have become the main driver of the project. A balance can probably be found between aligning with business realities and evangelism realities. I feel IBM guys are totally right to share their insights, but if there are no faith-zealots to counter-balance, then a positive may become a negative.

#What we plan to do

For all these reasons, CouchDb 3 will likely be the last release we will use. What we are about to activate is an effort to rewrite CouchDb to focus on the use case that we think makes CouchDb unique: a one-stop shop for all data storage needs, no matter the type of application and load. This means focusing, on the one hand, on working seamlessly with extremely large attachments and documents of any size, and on the other hand, on replication features (the two go hand in hand).

We will also seek to resurrect old features such as list views that we think need long-term faith. To make it possible from a bandwidth perspective, we will make a number of radical decisions. The two most important ones may be the following:

- Only map/reduce will be supported. Far from a limitation, we see this as a way of life and a different way of thinking about designing line-of-business applications. Our finding is that a line-of-business application never needs SQL-style flexibility for the main app if the problem space has been correctly modeled (instead of being Excel in the web browser). When Business Analytics are really needed, the need is always very localized, and it is nowadays easy enough to have an ETL pipeline on a separate instance (especially considering CouchDb filtered replication capabilities).
- Rewrite CouchDb in FSharp.

Rewriting in Fsharp will provide all the benefits of functional programming, while giving us access to a rich ecosystem of libraries, and a great static type checking system. All of this will mean more time to focus on the core features.

This is in a gist pretty much the plan. This is still early stages, and the way we do things, we would typically roll it out internally for a number of years before announcing it to the public. So I think there will likely be a 10-yearish window before you hear about this again.

I simply wanted to provide our feedback as a friendly contribution.

Re: CouchDb Rewrite/Fork

Posted by "Reddy B." <re...@live.fr>.

Hi Jan,

Thank you so much for your comprehensive reply. I have also read your parallel remarks in the IoT thread, which also helped me understand the current vision better.

I totally understand your points, there are certain aspects that we may have overlooked, misunderstood or not looked into with enough details.

Overlooked with regards to the fact that there may be a will but technical/performance challenges. Misunderstood with regards to the background story with dropped features, and not looked into with enough details with regards to what would be the best path to help move things forward.

At the very least, what you made clear is that the situation is less black and white and more nuanced than it may seem and I am thankful to you for taking the time to explain it.

We remain strongly curious about exploring what would be the best course of action to address our fears and needs, so I think what would be a productive course of action for us would be to on the one hand study the current implementation in more depth, and on the other hand, prototype with our ideas to see the actual value this would bring to the table when practical costs and challenges are factored in (in addition to checking if this is a realistic path).

As a remark, the reason I said F# is not to suggest that this is the best language, or to start the classic holy war that one language would solve every problem, or that it is time to rewrite everything with something trendier. If we go ahead we would use F# because we are a Microsoft shop and we are both very familiar with and very invested in the .NET ecosystem. I mentioned F# to echo our maintenance/tweaking concerns (but I probably should have mentioned that we are a Microsoft shop to make this point clearer - you had no way to guess it from my message).

In a gist, the reasoning is: if we are to undertake such a huge effort, let's also bring things to an environment with which we are familiar, especially since adequate tooling exists there. This is for internal reasons, not for absolutism / flavor of the month reasons. I mentioned it to provide you a data point on the technology stacks of your users, but once again I went too fast on that part.

We have definitely no desire to undertake such an effort if it wouldn't be productive, or if there are better ways to address our needs and concerns while at the same time contributing. We are not interested in getting into the database business, we are only looking to hedge our risk and reap productivity dividends as a bonus.

Now the thing is we are curious people and funding isn't necessarily a problem for us. And compared to most companies we have a strong appetite for reinventing the wheel if it makes us more efficient and/or less exposed to risk. So I think what we'll be doing is get our hands dirtier, and then review everything in light of your very important remarks to see what would be the most productive course of actions to invest our efforts, both for the community and for us (considering our needs, but also our comfort zone and realistic abilities).

Because that's really the data point I tried to convey. My message was less about "I think the project is wrongly headed" (which is not what I think, at most I wonder if it is still aligned with our goals and Jan's message was very helpful with this regard), than it was about saying: "here are our limitations, and fears, and the compromise we are exploring to address them. This is purely related to where we come from and not to absolutism but I feel this data point may be useful to the project".

Thanks very much for your answers. I think we'll be busy for a little while exploring options in more depth and this information will be very valuable to help us find the best compromise / way to invest our resources.

________________________________
From: Jan Lehnardt <ja...@apache.org>
Sent: Thursday, 11 July 2019 11:26
To: dev@couchdb.apache.org
Subject: Re: CouchDb Rewrite/Fork

Hi Reddy,

this is all pretty good feedback, thanks for taking the time to put this down.

> On 10. Jul 2019, at 01:07, Reddy B. <re...@live.fr> wrote:
>
> Hi all,
>
> I've checked the recent discussions and apparently July is the "vision month" lol. Hopefully this email will not saturate the patience of the core team.

I’m announcing my 2030-plan to rewrite CouchDB in bash ;D

> We have been thinking about forking/rewriting CouchDb internally for quite some time now, and this idea has reached a degree of maturity such that I'm pretty confident it will materialize at this point. We hesitated between doing our thing internally to then make our big open-sourcing announcement 5-10 years from now when the product is battle tested, and announcing our intentions here today.
>
> However, I realized that good things may happen by providing this feedback, and that providing this type of feedback also is a way of giving back to the community.
>
> The reason for this project is that we have lost confidence in the way the vision of CouchDb aligns with our goals. As far as we are concerned, there are 3 things we loved with CouchDb:
>
> #Map/Reduce
>
> We think that the benefits of Map/Reduce are very underrated. Map/reduce forces developers to approach problems differently and results in much more efficient and well-thought-out application architectures and implementations. This is in addition to the performance benefits since indexes are built in advance in a very predictable manner (with a few well-documented caveats). For this reason, our developers are forbidden from using Mango, and we require them to wrap their heads around problems until they are able to solve them in map/reduce mode.
>
> However, we can see that the focus of the CouchDb project is increasingly on Mango, and we have little confidence in the commitment of the project to first-class citizen Map/Reduce support (while this was for us a defining aspect of the identity of CouchDb).

Aside from a TBD point for how to support custom reduces in FDB (for which several folks have at least outlined a theoretical approach), MapReduce isn’t going anywhere. It is the foundation of how we can ensure that Mango remains scalable to the needs we want.

> #Complexity of the codebase
>
> An open-source software that is too complex to be tweaked and hacked is for all practical purposes closed-source software. You guys are VERY smart. And by nature a database software system is a non-trivial piece of technology.
>
> Initially we felt confident that the codebase was small enough and clean enough that should we really need to get our hands dirty in an emergency situation, we would be able to do so. Then Mango made the situation a bit blurrier, but we could easily ignore that, especially since we do not use it. However with FoundationDB... this becomes a whole different story.

I’m not sure how. Mango is literally just a self-contained module in the CouchDB module list. I don’t want to be flippant about this, but I really don’t understand how another folder in here makes any sort of a difference:

> ls src/
b64url                  couch_mrview            ets_lru                 jiffy                   rexi
bear                    couch_peruser           fabric                  ken                     setup
chttpd                  couch_plugins           fauxton                 khash                   smoosh
config                  couch_pse_tests         folsom                  mango                   snappy
couch                   couch_replicator        global_changes          meck                    triq
couch_epi               couch_stats             hqueue                  mem3
couch_event             couch_tests             hyper                   mochiweb
couch_index             ddoc_cache              ibrowse                 proper
couch_log               docs                    ioq                     rebar

With very few changes, you could delete that src/mango/ directory and have
a mangoless couch.

There are some complexities in the CouchDB codebase, but most if not all are in the src/couch/ submodule that is essentially 1.x plus some new stuff, all the new modules outside of src/couch are relatively well-defined and self-contained.

* * *

>
> The domain model of a database is non-trivial by nature, and now FoundationDb will introduce an additional level of abstraction and indirection, and a very serious one. I've been reading the design discussions since the FoundationDb announcement and there are a lot of impedance mismatches requiring the domain model of CouchDb to be broken up into fictitious entities intended to accommodate FoundationDb abstractions and their limitations (I'll come back to this point in a moment).
>
> Indirection is also introduced at the business logic level, with additional steps needing to be followed to emulate the desired behavior. All of this is complexity and obfuscation, and to be realistic, if we already struggled with the straight-to-the-point implementation, there is no way we'll be able to navigate (let alone hack), the FoundationDB-based implementation.

No argument here about the additional layer and impedance mismatch. But as outlined above, the FoundationDB change will get rid of all the code that is most gnarly in CouchDB today and replace it with a mature software project. The modules on top are going to be very lightweight in comparison.

>
> #(Apparent) Non-Alignment of FoundationDb with the reasons that made us love CouchDb
>
> FoundationDb introduces limitations regarding transactions, document sizes and a number of other critical items. One of the main reasons we use CouchDb is because of the way it allows us to develop applications rapidly and flexibly address all the state storage needs of application layers. CouchDb has you covered if you just want to dump large media files streamed with HTTP range requests while you iterate fast and your userbase is small, and replication allows you to seamlessly scale by distributing load on clusters in advanced ways without needing to redesign your applications. The user nkosi23 nicely describes some of the new possibilities enabled by CouchDb:
>
> https://github.com/apache/couchdb/pull/1253#issuecomment-507043600
>
> However, the limitations introduced by FoundationDb and the spirit of their project favoring abstraction purity through aggressive constraints, over operational flexibility is the opposite of the reasons we loved CouchDb and believed in it. It is to us pretty clear that the writing is on the wall. We aren't confident in FoundationDb to cover our bases, since covering our bases is explicitly not the goal of their project and their spirit is different from what has made CouchDb unique (ease of use, simple yet powerful and flexible abstractions etc...).

As Alex points out, we can talk about all this. But this line of reasoning also conveniently ignores another reality. Yes, while CouchDB allows you to store multi MB JSON docs, and GB’s worth of attachments, it will require a lot of computing resources if you are making a lot of use out of this. Disproportionally so, in comparison with other solutions (e.g. an abstraction layer that allows you to have smaller docs and that has binary storage outside of CouchDB).

On the flip-side, one of CouchDB’s goals, and main attractions is this: it grows with your needs, you don’t have to rewrite your app as usage grows. You should be able to go from a single-node CouchDB to a three-node cluster, to a ten-node cluster and however far your business or project takes you.

Today, CouchDB can’t fulfil this, because it doesn’t have limits that are similar to the FDB-imposed ones, especially around document and attachment sizes. So the question is this: regardless of FDB, should CouchDB impose limits on resources to ensure its original vision about scaling, or should it bend over backwards to support things that no sensible database should support (I jest, of course)?

As my dayjob includes making people successful with their CouchDB installations, I’m happy to continue to charge hourly for telling them to make their docs smaller, but really, I’d like for folks to be successful on their own, because that means we’ll get more users.

I have some concerns about transaction lengths and getting consistent snapshots out of an FDB-Couch with a one-shot _changes request, as we support today, but I hear that with the new storage engine coming in that at least becomes a little easier to consider.

All that said, for the people who truly can’t move to a FDB-CouchDB world, we are currently working hard on making CouchDB 3.x the absolute best it can be, so that it can be used for a long time going. If a significant part of this community finds itself staying on 3.x, I can promise, with enough contributions, we can even support it for a long time going forward. But we won’t be able to, if we don’t get enough folks pitching in.

>
> #Lack of commitment to the ideas pioneered
>
> We feel like CouchDb itself undervalues the wealth of what it has brought to the table. For example when it comes to architecting load balancing for all sorts of applications with a single and transparent value store, CouchDb enables things that simply weren't possible before, and people will need time to understand how they can take advantage of them.
>
> Nowadays we can see sed, awk and such be used in pretty clever ways, but it took time for people to incorporate the possibilities enabled by these tools in their thinking process (even though system administration tools are much easier to deploy than enterprise applications).

I don’t really understand what you are referring to here. What exactly did CouchDB pioneer that we are throwing away?

> I think that CouchDb should have a 10 or 20-year outlook on the paradigm shifts it introduces. There is a need to give more place to faith and less place to data since not every usage will be adopted within 3 years.

The ones that we are consciously shedding have been a relative dud since ~2012 (CouchApps). We’ve tried numerous times to rally the remaining enthusiasts around providing a modern implementation of that, the sorry history of which you can read up on in the couchapp@ mailing list. If there were end-users interested in this, that would lead to more folks wanting to build and maintain a modern version of CouchApps, that we’d happily support as part of CouchDB, but despite rallying a number of times in the past 8 years, it hasn’t come together.

I understand that some enthusiasts are extremely excited about CouchApps, trust me, I’ve been one of them. And I agree that we pioneered a number of things that are the new normal. I can trace a straight line from CouchDB to Node.js, Docker, Kubernetes and WASM, and while it might not all be connected perfectly, we’ve been pointing at where we are today way before anyone else. But that also means, we didn’t end up being the ones getting big with this. We started at a time when JS was still considered icky, not a staple of modern backend development. We were the first JSON/HTTP database, etc. We didn’t have the resources to keep up with that world, so we are focussing on the things we know we can achieve, and that leads to some tough decisions that we made some time around 2014, and communicated here repeatedly. I’m done arguing this point until someone comes with working code and commitment.

> Sometimes you need to do things because you believe in them and you know you are right and that eventually people will come. But right now, it feels like customer statistics from Cloudant have become the main driver of the project. A balance can probably be found between aligning with business realities and evangelism realities. I feel IBM guys are totally right to share their insights, but if there are no faith-zealots to counter-balance, then a positive may become a negative.

IBM’s influence here is undeniable, but I’m not seeing one instance of anything being negative. My job and that of the folks working at Neighbourhoodie (including Wohali) is different enough from what IBM is doing that I have high hopes that we bring important perspectives to the project that force us to find middle ground. I’d love more perspectives, but “let’s please stay in 2007” is not going to work.

Best
Jan
—

>
> #What we plan to do
>
> For all these reasons, CouchDb 3 will likely be the last release we will use. What we are about to activate is an effort to rewrite CouchDb to focus on the use case that we think makes CouchDb unique: a one-stop shop for all data storage needs, no matter the type of application and load. This means focusing, on the one hand, on working seamlessly with extremely large attachments and documents of any size, and on the other hand, on replication features (the two go hand in hand).
>
> We will also seek to resurrect old features such as list views that we think need long-term faith. To make it possible from a bandwidth perspective, we will make a number of radical decisions. The two most important ones may be the following:
>
> - Only map/reduce will be supported. Far from a limitation, we see this as a way of life and a different way of thinking about designing line-of-business applications. Our finding is that a line-of-business application never needs SQL-style flexibility for the main app if the problem space has been correctly modeled (instead of being Excel in the web browser). When Business Analytics are really needed, the need is always very localized, and it is nowadays easy enough to have an ETL pipeline on a separate instance (especially considering CouchDb filtered replication capabilities).
> - Rewrite CouchDb in FSharp.
>
> Rewriting in Fsharp will provide all the benefits of functional programming, while giving us access to a rich ecosystem of libraries, and a great static type checking system. All of this will mean more time to focus on the core features.
>
> This is in a gist pretty much the plan. This is still early stages, and the way we do things, we would typically roll it out internally for a number of years before announcing it to the public. So I think there will likely be a 10-yearish window before you hear about this again.
>
> I simply wanted to provide our feedback as a friendly contribution.

--
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

Re: CouchDb Rewrite/Fork

Posted by Naomi S <no...@tumbolia.org>.

On Thu, 11 Jul 2019 at 11:26, Jan Lehnardt <ja...@apache.org> wrote:

> Hi Reddy,
>
> this is all pretty good feedback, thanks for taking the time to put this
> down.
>
>
> > On 10. Jul 2019, at 01:07, Reddy B. <re...@live.fr> wrote:
> >
> > Hi all,
> >
> > I've checked the recent discussions and apparently July is the "vision month" lol. Hopefully this email will not saturate the patience of the core team.
>
> I’m announcing my 2030-plan to rewrite CouchDB in bash ;D
>

I'll beat you with my rewrite in GNU Make :P

Re: CouchDb Rewrite/Fork

Posted by Jan Lehnardt <ja...@apache.org>.

Hi Reddy,

this is all pretty good feedback, thanks for taking the time to put this down.

> On 10. Jul 2019, at 01:07, Reddy B. <re...@live.fr> wrote:
> 
> Hi all,
> 
> I've checked the recent discussions and apparently July is the "vision month" lol. Hopefully this email will not saturate the patience of the core team.

I’m announcing my 2030-plan to rewrite CouchDB in bash ;D

> We have been thinking about forking/rewriting CouchDb internally for quite some time now, and this idea has reached a degree of maturity such that I'm pretty confident it will materialize at this point. We hesitated between doing our thing internally to then make our big open-sourcing announcement 5-10 years from now when the product is battle tested, and announcing our intentions here today.
> 
> However, I realized that good things may happen by providing this feedback, and that providing this type of feedback also is a way of giving back to the community.
> 
> The reason for this project is that we have lost confidence in the way the vision of CouchDb aligns with our goals. As far as we are concerned, there are 3 things we loved with CouchDb:
> 
> #Map/Reduce
> 
> We think that the benefits of Map/Reduce are very underrated. Map/reduce forces developers to approach problems differently and results in much more efficient and well-thought-out application architectures and implementations. This is in addition to the performance benefits since indexes are built in advance in a very predictable manner (with a few well-documented caveats). For this reason, our developers are forbidden from using Mango, and we require them to wrap their heads around problems until they are able to solve them in map/reduce mode.
> 
> However, we can see that the focus of the CouchDb project is increasingly on Mango, and we have little confidence in the commitment of the project to first-class citizen Map/Reduce support (while this was for us a defining aspect of the identity of CouchDb).

Aside from a TBD point for how to support custom reduces in FDB (for which several folks have at least outlined a theoretical approach), MapReduce isn’t going anywhere. It is the foundation of how we can ensure that Mango remains scalable to the needs we want.

> #Complexity of the codebase
> 
> An open-source software that is too complex to be tweaked and hacked is for all practical purposes closed-source software. You guys are VERY smart. And by nature a database software system is a non-trivial piece of technology.
> 
> Initially we felt confident that the codebase was small enough and clean enough that should we really need to get our hands dirty in an emergency situation, we would be able to do so. Then Mango made the situation a bit blurrier, but we could easily ignore that, especially since we do not use it. However with FoundationDB... this becomes a whole different story.

I’m not sure how. Mango is literally just a self-contained module in the CouchDB module list. I don’t want to be flippant about this, but I really don’t understand how another folder in here makes any sort of a difference:

> ls src/
b64url			couch_mrview		ets_lru			jiffy			rexi
bear			couch_peruser		fabric			ken			setup
chttpd			couch_plugins		fauxton			khash			smoosh
config			couch_pse_tests		folsom			mango			snappy
couch			couch_replicator	global_changes		meck			triq
couch_epi		couch_stats		hqueue			mem3
couch_event		couch_tests		hyper			mochiweb
couch_index		ddoc_cache		ibrowse			proper
couch_log		docs			ioq			rebar

With very few changes, you could delete that src/mango/ directory and have
a mangoless couch.

There are some complexities in the CouchDB codebase, but most if not all are in the src/couch/ submodule that is essentially 1.x plus some new stuff, all the new modules outside of src/couch are relatively well-defined and self-contained.

* * *

> 
> The domain model of a database is non-trivial by nature, and now FoundationDb will introduce an additional level of abstraction and indirection, and a very serious one. I've been reading the design discussions since the FoundationDb announcement and there are a lot of impedance mismatches requiring the domain model of CouchDb to be broken up into fictitious entities intended to accommodate FoundationDb abstractions and their limitations (I'll come back to this point in a moment).
> 
> Indirection is also introduced at the business logic level, with additional steps needing to be followed to emulate the desired behavior. All of this is complexity and obfuscation, and to be realistic, if we already struggled with the straight-to-the-point implementation, there is no way we'll be able to navigate (let alone hack), the FoundationDB-based implementation.

No argument here about the additional layer and impedance mismatch. But as outlined above, the FoundationDB change will get rid of all the code that is most gnarly in CouchDB today and replace it with a mature software project. The modules on top are going to be very lightweight in comparison.

> 
> #(Apparent) Non-Alignment of FoundationDb with the reasons that made us love CouchDb
> 
> FoundationDb introduces limitations regarding transactions, document sizes and a number of other critical items. One of the main reasons we use CouchDb is the way it allows us to develop applications rapidly and flexibly address all the state storage needs of application layers. CouchDb has you covered if you just want to dump large media files streamed with HTTP range requests while you iterate fast and your userbase is small, and replication allows you to seamlessly scale by distributing load on clusters in advanced ways without needing to redesign your applications. The user nkosi23 nicely describes some of the new possibilities enabled by CouchDb:
> 
> https://github.com/apache/couchdb/pull/1253#issuecomment-507043600
> 
> However, the limitations introduced by FoundationDb and the spirit of their project favoring abstraction purity through aggressive constraints, over operational flexibility is the opposite of the reasons we loved CouchDb and believed in it. It is to us pretty clear that the writing is on the wall. We aren't confident in FoundationDb to cover our bases, since covering our bases is explicitly not the goal of their project and their spirit is different from what has made CouchDb unique (ease of use, simple yet powerful and flexible abstractions etc...).

As Alex points out, we can talk about all this. But this line of reasoning also conveniently ignores another reality. Yes, CouchDB allows you to store multi-MB JSON docs and GBs’ worth of attachments, but it will require a lot of computing resources if you make heavy use of this. Disproportionately so, in comparison with other solutions (e.g. an abstraction layer that allows you to have smaller docs and that keeps binary storage outside of CouchDB).

On the flip side, one of CouchDB’s goals and main attractions is this: it grows with your needs, you don’t have to rewrite your app as usage grows. You should be able to go from a single-node CouchDB to a three-node cluster, to a ten-node cluster and however far your business or project takes you.

Today, CouchDB can’t fulfil this, because it doesn’t have limits that are similar to the FDB-imposed ones, especially around document and attachment sizes. So the question is this: regardless of FDB, should CouchDB impose limits on resources to ensure its original vision about scaling, or should it bend over backwards to support things that no sensible database should support (I jest, of course)?
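
For illustration, a limit of this kind can be checked on the client before a write is attempted. This is a minimal Python sketch, assuming a made-up ceiling MAX_DOC_BYTES; the value and the helper names are hypothetical and do not reflect any CouchDB or FoundationDB default.

```python
import json

# Hypothetical limit for illustration only; not a CouchDB or FoundationDB default.
MAX_DOC_BYTES = 1_000_000


def encoded_size(doc: dict) -> int:
    """Return the size in bytes of the document's compact UTF-8 JSON encoding."""
    text = json.dumps(doc, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def fits_limit(doc: dict, limit: int = MAX_DOC_BYTES) -> bool:
    """Return True if the encoded document is at or below the limit."""
    return encoded_size(doc) <= limit
```

An application could call fits_limit before each write and move oversized payloads to external binary storage instead of discovering the problem once the database is under load.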

As my day job includes making people successful with their CouchDB installations, I’m happy to continue to charge hourly for telling them to make their docs smaller, but really, I’d like for folks to be successful on their own, because that means we’ll get more users.

I have some concerns about transaction lengths and getting consistent snapshots out of an FDB-Couch with a one-shot _changes request, as we support today, but I hear that with the new storage engine coming in that at least becomes a little easier to consider.

All that said, for the people who truly can’t move to a FDB-CouchDB world, we are currently working hard on making CouchDB 3.x the absolute best it can be, so that it can be used for a long time to come. If a significant part of this community finds itself staying on 3.x, I can promise that, with enough contributions, we can even support it for a long time going forward. But we won’t be able to if we don’t get enough folks pitching in.

> 
> #Lack of commitment to the ideas pioneered
> 
> We feel like Couchdb itself undervalues the wealth of what it has brought to the table. For example when it comes to architecting load balancing for all sorts of applications with a single and transparent value store, CouchDb enables things that simply weren't possible before, and people will need time to understand how they can take advantage of them.
> 
> Nowadays we can see sed, awk and such being used in pretty clever ways, but it took time for people to incorporate the possibilities enabled by these tools in their thinking process (even though system administration tools are much easier to deploy than enterprise applications).

I don’t really understand what you are referring to here. What exactly did CouchDB pioneer that we are throwing away?

> I think that CouchDb should have a 10 or 20-year outlook on the paradigm shifts it introduces. There is a need to give more place to faith and less place to data, since not every usage will be adopted within 3 years.

The ones that we are consciously shedding have been a relative dud since ~2012 (CouchApps). We’ve tried numerous times to rally the remaining enthusiasts around providing a modern implementation of that, the sorry history of which you can read up on in the couchapp@ mailing list. If there were end-users interested in this, that would lead to more folks wanting to build and maintain a modern version of CouchApps, that we’d happily support as part of CouchDB, but despite rallying a number of times in the past 8 years, it hasn’t come together.

I understand that some enthusiasts are extremely excited about CouchApps; trust me, I’ve been one of them. And I agree that we pioneered a number of things that are the new normal. I can draw a straight line from CouchDB to Node.js, Docker, Kubernetes and WASM, and while it might not all be connected perfectly, we’ve been pointing at where we are today way before anyone else. But that also means we didn’t end up being the ones getting big with this. We started at a time when JS was still considered icky, not a staple of modern backend development. We were the first JSON/HTTP database, etc. We didn’t have the resources to keep up with that world, so we are focussing on the things we know we can achieve, and that leads to some tough decisions that we made some time around 2014, and communicated here repeatedly. I’m done arguing this point until someone comes with working code and commitment.

> Sometimes you need to do things because you believe in them and you know you are right and that eventually people will come. But right now, it feels like customer statistics from Cloudant have become the main driver of the project. A balance can probably be found between aligning with business realities and evangelism realities. I feel IBM guys are totally right to share their insights, but if there are no faith-zealots to counter-balance, then a positive may become a negative.

IBM’s influence here is undeniable, but I’m not seeing one instance of anything being negative. My job, and that of the folks working at Neighbourhoodie (including Wohali), is different enough from what IBM is doing that I have high hopes that we bring important perspectives to the project that force us to find middle ground. I’d love more perspectives, but “let’s please stay in 2007” is not going to work.

Best
Jan
—

> 
> #What we plan to do
> 
> For all these reasons, CouchDb 3 will likely be the last release we will use. What we are about to activate is an effort to rewrite CouchDb to focus on the use case that we think makes CouchDb unique: a one-stop shop for all data storage needs, no matter the type of application and load. This means focusing, on the one hand, on working seamlessly with extremely large attachments and documents of any size, and on the other hand, on replication features (the two go hand in hand).
> 
> We will also seek to resurrect old features such as list views that we think need long-term faith. To make it possible from a bandwidth perspective, we will make a number of radical decisions. The two most important ones may be the following:
> 
> - Only map/reduce will be supported. Far from a limitation, we see this as a way of life and a different way of thinking about designing line-of-business applications. Our finding is that a line-of-business application never needs SQL-style flexibility for the main app if the problem space has been correctly modeled (instead of being Excel in the web browser). When Business Analytics are really needed, the need is always very localized, and it is nowadays easy enough to have an ETL pipeline on a separate instance (especially considering CouchDb filtered replication capabilities).
> - Rewrite CouchDb in FSharp.
> 
> Rewriting in Fsharp will provide all the benefits of functional programming, while giving us access to a rich ecosystem of libraries, and a great static type checking system. All of this will mean more time to focus on the core features.
> 
> This is in a gist pretty much the plan. This is still early stages, and the way we do things, we would typically roll it out internally for a number of years before announcing it to the public. So I think there will likely be a 10-yearish window before you hear about this again.
> 
> I simply wanted to provide our feedback as a friendly contribution.

-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

Re: CouchDb Rewrite/Fork

Posted by Joan Touzet <wo...@apache.org>.

On 2019-07-10 18:15, Johs Ensby wrote:
> Reddy and Joan,
> 
>> On 10 Jul 2019, at 23:16, Reddy B. <re...@live.fr> wrote:
>> Thanks Joan, keeping API + replication protocol compatibility is definitely the plan
> 
> New forks/rewrites with API and replication protocol* compatibility would be great news for the user community.
> *) Fail-proof master-to-master replication of documents with multiple large attachments.

Specific proposals to improve the replication protocol are welcome
through the new RFC process. Sample code demonstrating scalability of
the approach, as well as backwards compatibility, would be important to
its acceptance.

-Joan

Re: CouchDb Rewrite/Fork

Posted by Johs Ensby <jo...@b2w.com>.

Reddy and Joan,

> On 10 Jul 2019, at 23:16, Reddy B. <re...@live.fr> wrote:
> Thanks Joan, keeping API + replication protocol compatibility is definitely the plan

New forks/rewrites with API and replication protocol* compatibility would be great news for the user community.
*) Fail-proof master-to-master replication of documents with multiple large attachments.

johs

Re: CouchDb Rewrite/Fork

Posted by "Reddy B." <re...@live.fr>.

Thanks Joan, keeping API + replication protocol compatibility is definitely the plan (especially since we'll need to migrate existing applications internally). Sounds great, we'll keep it in mind, thank you.
________________________________
From: Joan Touzet <wo...@apache.org>
Sent: Wednesday, 10 July 2019 23:00
To: dev@couchdb.apache.org
Subject: Re: CouchDb Rewrite/Fork

Sounds like a challenging and intense project!

If you retain compatibility with the CouchDB replication protocol, we'd
certainly be willing to include it in our list of CouchDB ecosystem
partners, once your product is publicly available.

-Joan

On 2019-07-10 10:04, Reddy B. wrote:
> Of course, this will have nothing to do in terms of branding, I don't even think we'll use the codebase. Moreover our primary goal isn't to offer a competing product to the public. It is to serve our internal needs and reduce our risk. We will only open-source it as a way to give back once the product is very mature (which is also a way to reduce support needs).
>
> Thanks
> ________________________________
> From: Robert Newson <rn...@apache.org>
> Sent: Wednesday, 10 July 2019 15:47
> To: dev@couchdb.apache.org
> Subject: Re: CouchDb Rewrite/Fork
>
> That’s valuable feedback thank you.
>
> Best of luck with your new project and a gentle reminder that you may not call it CouchDB.
>
> B.
>

Re: CouchDb Rewrite/Fork

Posted by Joan Touzet <wo...@apache.org>.

Sounds like a challenging and intense project!

If you retain compatibility with the CouchDB replication protocol, we'd
certainly be willing to include it in our list of CouchDB ecosystem
partners, once your product is publicly available.

-Joan

On 2019-07-10 10:04, Reddy B. wrote:
> Of course, this will have nothing to do in terms of branding, I don't even think we'll use the codebase. Moreover our primary goal isn't to offer a competing product to the public. It is to serve our internal needs and reduce our risk. We will only open-source it as a way to give back once the product is very mature (which is also a way to reduce support needs).
> 
> Thanks
> ________________________________
> From: Robert Newson <rn...@apache.org>
> Sent: Wednesday, 10 July 2019 15:47
> To: dev@couchdb.apache.org
> Subject: Re: CouchDb Rewrite/Fork
> 
> That’s valuable feedback thank you.
> 
> Best of luck with your new project and a gentle reminder that you may not call it CouchDB.
> 
> B.
> 

Re: CouchDb Rewrite/Fork

Posted by "Reddy B." <re...@live.fr>.

Of course, this will have nothing to do in terms of branding, I don't even think we'll use the codebase. Moreover our primary goal isn't to offer a competing product to the public. It is to serve our internal needs and reduce our risk. We will only open-source it as a way to give back once the product is very mature (which is also a way to reduce support needs).

Thanks
________________________________
From: Robert Newson <rn...@apache.org>
Sent: Wednesday, 10 July 2019 15:47
To: dev@couchdb.apache.org
Subject: Re: CouchDb Rewrite/Fork

That’s valuable feedback thank you.

Best of luck with your new project and a gentle reminder that you may not call it CouchDB.

B.

> On 10 Jul 2019, at 00:07, Reddy B. <re...@live.fr> wrote:
>
> Hi all,
>
> I've checked the recent discussions and apparently July is the "vision month" lol. Hopefully this email will not saturate the patience of the core team.
>
> We have been thinking about forking/rewriting CouchDb internally for quite some time now, and this idea has reached a degree of maturity such that I'm pretty confident it will materialize at this point. We hesitated between doing our thing internally to then make our big open-sourcing announcement 5-10 years from now when the product is battle tested, and announcing our intentions here today.
>
> However, I realized that good things may happen by providing this feedback, and that providing this type of feedback also is a way of giving back to the community.
>
> The reason for this project is that we have lost confidence in the way the vision of CouchDb aligns with our goals. As far as we are concerned, there are 3 things we loved with CouchDb:
>
> #Map/Reduce
>
> We think that the benefits of Map/Reduce are very underrated. Map/reduce forces developers to approach problems differently and results in much more efficient and well-thought-out application architectures and implementations. This is in addition to the performance benefits since indexes are built in advance in a very predictable manner (with a few well-documented caveats). For this reason, our developers are forbidden from using Mango, and we require them to wrap their head around problems until they are able to solve them in map/reduce mode.
>
> However, we can see that the focus of the CouchDb project is increasingly on Mango, and we have little confidence in the commitment of the project to first-class citizen Map/Reduce support (while this was for us a defining aspect of the identity of CouchDb).
>
> #Complexity of the codebase
>
> An open-source software that is too complex to be tweaked and hacked is for all practical purposes closed-source software. You guys are VERY smart. And by nature a database software system is a non-trivial piece of technology.
>
> Initially we felt confident that the codebase was small enough and clean enough that should we really need to get our hands dirty in an emergency situation, we would be able to do so. Then Mango made the situation a bit blurrier, but we could easily ignore that, especially since we do not use it. However with FoundationDB... this becomes a whole different story.
>
> The domain model of a database is non-trivial by nature, and now FoundationDb will introduce an additional level of abstraction and indirection, and a very serious one. I've been reading the design discussions since the FoundationDb announcement and there are a lot of impedance mismatches requiring the domain model of CouchDb to be broken up into fictitious entities intended to accommodate FoundationDb abstractions and their limitations (I'll come back to this point in a moment).
>
> Indirection is also introduced at the business logic level, with additional steps needing to be followed to emulate the desired behavior. All of this is complexity and obfuscation, and to be realistic, if we already struggled with the straight-to-the-point implementation, there is no way we'll be able to navigate (let alone hack) the FoundationDB-based implementation.
>
> #(Apparent) Non-Alignment of FoundationDb with the reasons that made us love CouchDb
>
> FoundationDb introduces limitations regarding transactions, document sizes and a number of other critical items. One of the main reasons we use CouchDb is because of the way it allows us to develop applications rapidly and flexibly address all the state storage needs of application layers. CouchDb has you covered if you just want to dump large media files streamed with HTTP range requests while you iterate fast and your userbase is small, and replication allows you to seamlessly scale by distributing load on clusters in advanced ways without needing to redesign your applications. The user nkosi23 nicely describes some of the new possibilities enabled by CouchDb:
>
> https://github.com/apache/couchdb/pull/1253#issuecomment-507043600
>
> However, the limitations introduced by FoundationDb, and the spirit of their project, which favors abstraction purity through aggressive constraints over operational flexibility, are the opposite of the reasons we loved CouchDb and believed in it. It is pretty clear to us that the writing is on the wall. We aren't confident in FoundationDb to cover our bases, since covering our bases is explicitly not the goal of their project and their spirit is different from what has made CouchDb unique (ease of use, simple yet powerful and flexible abstractions, etc.).
>
> #Lack of commitment to the ideas pioneered
>
> We feel like CouchDb itself undervalues the wealth of what it has brought to the table. For example, when it comes to architecting load balancing for all sorts of applications with a single and transparent value store, CouchDb enables things that simply weren't possible before, and people will need time to understand how they can take advantage of them.
>
> Nowadays we can see sed, awk and such being used in pretty clever ways, but it took time for people to incorporate the possibilities enabled by these tools in their thinking process (even though system administration tools are much easier to deploy than enterprise applications).
>
> I think that CouchDb should have a 10 or 20-year outlook on the paradigm shifts it introduces. There is a need to give more place to faith and less place to data, since not every usage will be adopted within 3 years. Sometimes you need to do things because you believe in them and you know you are right and that eventually people will come. But right now, it feels like customer statistics from Cloudant have become the main driver of the project. A balance can probably be found between aligning with business realities and evangelism realities. I feel IBM guys are totally right to share their insights, but if there are no faith-zealots to counterbalance them, then a positive may become a negative.
>
> #What we plan to do
>
> For all these reasons, CouchDb 3 will likely be the last release we will use. What we are about to activate is an effort to rewrite CouchDb to focus on the use case that we think makes CouchDb unique: a one-stop shop for all data storage needs, no matter the type of application and load. This means focusing, on the one hand, on working seamlessly with extremely large attachments and documents of any size, and on the other hand, on replication features (the two go hand in hand).
>
> We will also seek to resurrect old features such as list views that we think need long-term faith. To make it possible from a bandwidth perspective, we will make a number of radical decisions. The two most important ones may be the following:
>
> - Only map/reduce will be supported. Far from a limitation, we see this as a way of life and a different way of thinking about designing line-of-business applications. Our finding is that a line-of-business application never needs SQL-style flexibility for the main app if the problem space has been correctly modeled (instead of being Excel in the web browser). When Business Analytics are really needed, the need is always very localized, and it is nowadays easy enough to have an ETL pipeline on a separate instance (especially considering CouchDb filtered replication capabilities).
> - Rewrite CouchDb in FSharp.
>
> Rewriting in FSharp will provide all the benefits of functional programming, while giving us access to a rich ecosystem of libraries and a great static type checking system. All of this will mean more time to focus on the core features.
>
> This is the gist of the plan. This is still early stages, and the way we do things, we would typically roll it out internally for a number of years before announcing it to the public. So I think there will likely be a 10-yearish window before you hear about this again.
>
> I simply wanted to provide our feedback as a friendly contribution.

Re: CouchDb Rewrite/Fork

Posted by Robert Newson <rn...@apache.org>.

That’s valuable feedback, thank you.

Best of luck with your new project and a gentle reminder that you may not call it CouchDB. 

B. 
