Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/01/02 10:15:11 UTC

[GitHub] [couchdb] nicknaychov opened a new issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

nicknaychov opened a new issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386
 
 
   
   Hello & Happy New Year to all!
   
   I recently split my cluster into two zones. The question now is how to find out whether that was successful.
   I could not find anything in the docs or via Google; overall, information about setting up, experimenting with, and using CouchDB zoning seems scarce. :)
   I can verify that replication is working, since creating/deleting DBs is reflected on all nodes.
   I just want to ensure that R/W requests made in zone 1 are not sent to zone 2. How can this be verified?
   I enabled debug logging, but it looks like this type of internal information is not logged.
   
   Any ideas would be welcome.
   Thanks
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [couchdb] kocolosk closed issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
kocolosk closed issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386
 
 
   


[GitHub] [couchdb] nicknaychov commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
nicknaychov commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-574241520
 
 
   Yes, it helps. Thanks, @wohali.
   
   My point was: in order to keep the third node in sync, rather than sending it R/W requests (quorum can be satisfied by the 2 local nodes), maybe a binary replication protocol or something more efficient could be used. Just an idea; I am sure it is not easy and might need drastic changes on the backend.
   
   CouchDB 4.0: that would be awesome! Thanks for the link.
   
   What would your recommendation then be for people who have mirrored deployments in two DCs, both running live traffic, that need to share the same consistent data across both DCs? Each needs to be aware of the other's data with minimal delay.
   Would you still recommend master-master replication? I have heard of issues there as well: increased latency and decreased throughput due to the serialization of data into JSON.
   
   Does this approach scale well if you have, let's say, 3 DCs?
   
   Thanks


[GitHub] [couchdb] nicknaychov edited a comment on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
nicknaychov edited a comment on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-572687087
 
 
   @kocolosk wow, this is big. You just saved me a lot of time, and I believe many other people as well!
   
   I think this should be put somewhere prominent in the documentation:
   
   **The r, w and z parameters are not supported anymore.**
   (Even better, it would be awesome if CouchDB displayed an error when it sees them.)
   
   **The n parameter will be overridden by "placement" if present** - I saw this already in the docs.
   
   I think CouchDB is a great project, but there are a lot of pitfalls and a lack of good docs, which makes a lot of people give up and move on to other solutions.
   
   I think replication, n (replicas) and placement are overlapping and confusing concepts for some, so clarifying when and which of them should be used would help many people, I am sure.
   
   For example, in my case I have a lot of DBs which get added and removed on the fly by upper-layer logic, so I would need a special script to detect that and start/stop per-DB replications. Replication is therefore not suitable in my case.
   
   If I use the approach with n=3 replicas, two nodes on the local site and one on the remote site, that means unnecessary WAN load and degraded performance.
   
   Thus I think the optimal solution would be DB placement with two replicas hosted on 2 nodes in the local site and 1 on a remote site node. This way, R/W will occur only on the local site (if both nodes are up), avoiding unnecessary WAN delays, while I still have a backup in case of emergency.
   
   If @kocolosk or somebody else can confirm my statements, it will be much appreciated. I think this would be a good example and a candidate for a best practices / deployments section.
   Thank you.


[GitHub] [couchdb] nicknaychov commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
nicknaychov commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-572457662
 
 
   Thank you for the answer, @kocolosk.
   
   That makes things a little bit clearer. Just to add: I do not use placement, just z=2. I rely on "automatic" placement, since I need the second cluster to be used as a backup in case the main site is down. I adopted CouchDB before the placement option was introduced, so I am not sure whether I should use it and how exactly it would fit my case.
   
   Would you actually recommend using the placement parameter in favor of the z parameter?
   
   Is there any link describing when to use placement vs. the z parameter? Maybe one of them should be deprecated in a future release if their functionality overlaps, in order to avoid confusion.
   
   In my case with 4 nodes, 2 zones (z=2) and n=3, without DB placement: if a request comes to zone 1, will R/W be done only on the 2 nodes from that zone, with no WAN crossing? My purpose is to avoid WAN crossing and thus slowing down the cluster.
   
   BTW, the shard map looks OK:
   
   ```json
   "shards": {
       "00000000-55555554": [
           "couchdb@pbx1-z1.domain.ca",
           "couchdb@pbx1-z2.domain.ca",
           "couchdb@pbx2-z2.domain.ca"
       ],
       "55555555-aaaaaaa9": [
           "couchdb@pbx1-z1.domain.ca",
           "couchdb@pbx1-z2.domain.ca",
           "couchdb@pbx2-z1.domain.ca"
       ],
       "aaaaaaaa-ffffffff": [
           "couchdb@pbx1-z2.domain.ca",
           "couchdb@pbx2-z1.domain.ca",
           "couchdb@pbx2-z2.domain.ca"
       ]
   }
   ```
   
   Many thanks to anybody who can shed some light on this.



[GitHub] [couchdb] kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-571716339
 
 
   Hi @nicknaychov, are you talking about the `placement` option described here?
   
   https://docs.couchdb.org/en/stable/cluster/sharding.html#specifying-database-placement
   
   That setting controls which nodes host replicas of database shards. CouchDB has some internal optimizations to preferentially retrieve DB metadata from nodes in the same zone as the node handling the incoming HTTP request, but general R/W traffic will cross zones and communicate with every replica of a database shard as needed. Typically the placement setting is used to ensure replicas are distributed to different fault domains, e.g. different availability zones in a cloud region.
   
   If you want to confirm that the placement you configured took effect for a particular database you can query the `_shards` endpoint:
   
   https://docs.couchdb.org/en/stable/api/database/shard.html
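   As an illustration, a `_shards` response can be checked for how each range's replicas spread across zones. Below is a minimal sketch in Python; the node names and the zone-from-hostname convention are assumptions mirroring this thread's naming, not CouchDB behaviour:
   
   ```python
   import json
   from collections import Counter
   
   # Hypothetical GET /{db}/_shards response; in practice you would
   # fetch this with an authenticated request to the cluster.
   shards_response = json.loads("""
   {
     "shards": {
       "00000000-7fffffff": ["couchdb@pbx1-z1.domain.ca",
                             "couchdb@pbx1-z2.domain.ca",
                             "couchdb@pbx2-z2.domain.ca"],
       "80000000-ffffffff": ["couchdb@pbx1-z2.domain.ca",
                             "couchdb@pbx2-z1.domain.ca",
                             "couchdb@pbx2-z2.domain.ca"]
     }
   }
   """)
   
   def zone_of(node):
       # Assumption for this sketch: the zone is encoded in the hostname
       # (e.g. "pbx1-z1.domain.ca" -> "z1"). Real deployments record the
       # zone in the node's _nodes document instead.
       host = node.split("@", 1)[1]
       return host.split(".", 1)[0].rsplit("-", 1)[-1]
   
   # Print the per-zone replica count for every shard range.
   for shard_range, nodes in sorted(shards_response["shards"].items()):
       per_zone = Counter(zone_of(n) for n in nodes)
       print(shard_range, dict(per_zone))
   ```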
   
   Even in the case where you configured the placement so that the shards for database A are hosted on one set of nodes and the shards for database B are hosted on another set, the cluster will still cross zones as needed to satisfy R/W requests. For example, you could submit a request to one of the nodes hosting A asking to read a document for B and that request will succeed -- it will just issue the internal RPCs to retrieve the document data from the nodes in the other zone.
   
   Hopefully that makes sense. Also dropping the `rfc` label as that's intended for formal proposals to enhance CouchDB. 


[GitHub] [couchdb] nicknaychov edited a comment on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
nicknaychov edited a comment on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-573329244
 
 
   Thank you for your super useful comments as usual, @kocolosk and @wohali.
   
   The buffering @kocolosk mentioned, which can cause an outage in the primary zone if the backup zone goes offline, is pretty scary to me, however.
   
   I guess the same is valid vice versa: if the primary goes offline, then the backup site's DB could crash due to the same buffering issue.
   
   Any advice on how to avoid issues in the primary zone if the backup goes offline?
   
   If instead of *placement* I use the classical approach with *n=3*, will that make any difference? I guess not.
   
   Maybe there is a setting to increase the buffering?
   
   Aside from that, it seems the placement implementation does not benefit from site zoning at all; i.e., if you have two sites and quorum can be satisfied by the nodes at site A alone (as in my case), R/W will nevertheless still be sent over the WAN to site B, which I think is a pure waste of resources. If that could be optimized, could it also solve the buffering issue you mentioned?
   I am sure you know this, but I guess there are serious considerations against improvements in that area.
   
   Thank you,
   you are the best!



[GitHub] [couchdb] kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-573251433
 
 
   The way I see it, you're both correct 😸
   
   Configuring the placement topology as @nicknaychov described with 2 copies in the "primary" zone and one copy in the "backup" zone will generally yield local latencies for read and write operations in the primary zone in a healthy cluster. I've seen large clusters operate this way using a backup zone ~35ms away. I also acknowledge that configuring an HTTP replication per database for a large number of databases is an onerous task and the placement configuration looks like a nice way to sidestep that burden.
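   For reference, a 2+1 topology like the one described is configured through the `[cluster]` placement setting, with each node also carrying a matching `zone` attribute in its `_nodes` document; the zone names below are illustrative:
   
   ```ini
   [cluster]
   ; two shard replicas in the primary zone, one in the backup zone
   placement = primary-zone:2,backup-zone:1
   ```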
   
   @wohali is correct that the multi-zone placement configuration will use a lot of WAN bandwidth, more than using HTTP replication if the read load is high.
   
   The problem with the WAN multi-zone configuration is that we haven't really optimized the RPC and networking mechanisms inside the cluster for dealing with that kind of inter-node latency. I've seen outages in the *primary* zone caused by the backup zone going offline, as the primary zone starts buffering a lot of messages to send over the failed link. We've addressed many of those issues, but the fact remains that it's not as well-tested as a configuration where all the zones are in the same metro area.
   
   If you do go that route, I'd encourage you to simulate a failed link between the zones with some realistic workload on the cluster and make sure it responds appropriately. 
   
   



[GitHub] [couchdb] kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
kocolosk commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-572620493
 
 
   Ah, I'd forgotten about that option. I'm ... not sure that was ever supported in a released version of CouchDB (as opposed to the BigCouch fork). We removed it in d5f5ff2ccdb44df5dbc0df61169163d90aced978. So yes, I would recommend you use the placement config setting instead.
   
   Neither `z` nor `placement` will have much of an effect on the WAN traffic. There's a small effect, but CouchDB will definitely still send a fair amount of traffic across zones. See my comment on #2329 for more detail.
   
   If you want a second deployment as a backup our current recommendation is to run a separate cluster and configure replications from the primary to the backup.


[GitHub] [couchdb] wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-574300980
 
 
   It scales differently than using CouchDB cluster traffic for the same approach, because of the limitations mentioned. [There are always tradeoffs in distributed computing](https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing).


[GitHub] [couchdb] wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-573337808
 
 
   Hi again,
   
   @kocolosk I remember being in a situation with you many years ago where outages in the primary zone were caused by the backup zone going offline, with the backup zone halfway across the US. That was no fun, and it was often uncontrollable (for instance when the WAN link was merely very slow, so the "remote" nodes weren't _completely_ offline).
   
   @nicknaychov The key is that with _n=3_ and the third shard replica across that WAN link, when a node or disk is out at the _primary_ site, you must wait for the remote site to respond for any task to complete. The extra wait puts pressure on the buffering queues and slows the database from an end-user/application perspective, so queues farther back up the line start filling up too. Eventually, this can lead to a cascading failure.
   
   Incidentally, I don't think that it's true that sending the traffic to site B is a "pure waste of resources," especially for write traffic where one of the shards per database is stored at that site. (That remote shard needs to be updated somehow!) And if that remote shard falls farther behind, because of buffered queues, the situation where a node at the primary site goes offline becomes worse. Now you only have 2 copies online (assuming _n=3_) that disagree with each other, so the cluster cannot achieve quorum for your database. As the remaining node at the primary site becomes overloaded, you *will* start getting responses from the remote site being sent to clients, meaning outdated data. Your only indication for this is a 202 response, and many HTTP/CouchDB client libraries do not distinguish between 201 and 202. (Yours may, I wouldn't know...)
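   If your HTTP client exposes the raw status code, a small guard can surface the degraded case. A sketch in Python follows; the function name and the usage comment are illustrative, while the 201-vs-202 distinction itself is CouchDB's documented behaviour:
   
   ```python
   def write_met_quorum(status_code: int) -> bool:
       """Return True only when CouchDB reports a fully quorate write.
   
       CouchDB answers a document write with 201 when the write quorum
       was met, and 202 when the write was accepted by fewer replicas
       than the quorum requires.
       """
       if status_code == 202:
           # Stored durably on at least one node, but quorum was not
           # reached -- worth logging or alerting on.
           return False
       return status_code == 201
   
   # Hypothetical usage with an HTTP client:
   #   resp = http_put(f"{db_url}/{doc_id}", body=doc)
   #   if not write_met_quorum(resp.status_code):
   #       warn(f"degraded write for {doc_id}")
   ```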
   
   If your intent is to have disaster recovery or a "hot standby" at a remote site, with that standby node/cluster acting primarily as a consumer of data from the main cluster and live switchover to it possible, the absolute best way to do this is to treat the two sites as separate clusters (or a cluster and a single backup node) and use multiple standard HTTP replications. You'll be in good company with this approach: it is tested and recommended, and more stable and reliable than relying on Erlang distribution RPC traffic over a WAN link.
   
   Since it's not mentioned yet, I'll bring up that CouchDB 4.0 completely replaces the clustering and networking layer with a new implementation, based on FoundationDB. You can read more about it on our mailing lists and at [this article](https://www.ibm.com/cloud/blog/new-builders/database-deep-dives-couchdb).
   
   Because of this, the reality is that we're not going to be making any significant changes to the clustering code at this time. More than that, zone/placement is a rarely used CouchDB feature outside of people coming from BigCouch, mostly SIP server/Kazoo-influenced users like yourself, where BigCouch found some traction.
   
   It _may_ be possible to integrate improvements to the clustering code for this situation, but optimizing the RPC for dealing with increased intra-cluster latency isn't high on the core development team's list. We'd certainly entertain a pull request for the 3.x series working on this code, but it's not a trivial matter to sort out.
   
   Does this help?


[GitHub] [couchdb] wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?

Posted by GitBox <gi...@apache.org>.
wohali commented on issue #2386: CouchDB Zones: How to verify if zones are setup correctly?
URL: https://github.com/apache/couchdb/issues/2386#issuecomment-572732823
 
 
   @nicknaychov No, this is not correct:
   
   > Thus I think optimal solution would be using DB placement with two replicas hosted on 2 nodes in the local site and 1 on remote site node. This way I think will achieve that R/W will occur only on the local site(if both nodes are up) and avoid unnecessary WAN delays, while in case of emergency I will have backup.
   
   Every read or write will attempt to access all replicas of a given database shard. Even if 2 of those 3 responses arrive locally and result in a faster response to the client, you still have traffic for every request transiting your WAN.
   
   As @kocolosk said, your best approach is to use a standard 3-node main cluster (one zone), then use standard CouchDB replication to keep your offsite backup current. You can run these replications on the backup cluster (i.e. "pull" replication) to keep the load on the main cluster light.
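   Such a pull replication is typically defined as a document in the backup cluster's `_replicator` database, one per database to protect; a sketch, with hostnames and credentials as placeholders:
   
   ```json
   {
     "_id": "pull-mydb-from-primary",
     "source": "https://user:pass@primary.example.com/mydb",
     "target": "https://user:pass@backup.example.com/mydb",
     "continuous": true
   }
   ```
   
   With `"continuous": true` the replication keeps running and picks up new changes, so the backup stays current without re-triggering.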

