Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2019/12/26 17:27:03 UTC

[GitHub] [couchdb] nicknaychov opened a new issue #2382: CouchDB backup instance - zones vs replicas?

nicknaychov opened a new issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382
 
 
   Hi,
   I just had a nontraditional Christmas thought pop into my mind :). I would like to set up a backup of the main CouchDB installation in a second datacenter. Note that the second setup would only be used if the first setup in the main datacenter is down. In the main datacenter I currently have 3 CouchDB nodes; in the second datacenter I would like to have just 1. I do not want to mirror the first setup with another 3 VMs because of the overhead of managing and supporting them, and cost is also a consideration.
   
   Initially I was thinking of setting up a second zone for the CouchDB backup site with these settings:
   
   ```
   q=3
   r=1
   w=1
   n=1
   z=2
   ```
   
   but I am not sure how adding a new zone to an existing setup works, or whether I need to do some re-balancing (I have never done CouchDB zoning). I also think this might be over-engineering what I am trying to do, so I would like to find a simpler, more effective approach, just like I do for everything :)
   
   What if I just move one of the three CouchDB nodes to the backup site?
   
   I know that reads and writes should not happen over the WAN and only replication should, so what would be the best approach?
   
   Current setup is:
   
   ```
   q=3
   r=2
   w=2
   n=3
   ```
   so I am thinking of changing it to:
   
   ```
   q=3
   r=1
   w=1
   n=3
   ```
   That way I would guarantee that reads and writes never leave the site or go over the WAN. This is important for security and to prevent slowdowns caused by the DB quorum requirements, while keeping the server setup lean at the same time.
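   
   For reference, a minimal sketch of where the shard-related knobs live, assuming a standard local.ini (values illustrative; as discussed further down the thread, r and w end up being per-request parameters rather than durable cluster settings):
   
   ```
   ; local.ini (illustrative) -- cluster-wide defaults for newly created databases
   [cluster]
   q = 3   ; shards per database
   n = 3   ; copies of each shard
   ```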
   
   Or is this a bad idea and I should just use 2 zones?
   
   Any feedback or thoughts on how you would back up your site's instance, and whether the above would work well, will be much appreciated.
   
   Thanks 
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [couchdb] nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-573295902
 
 
   @nicknaychov 
   
   Thank you for sharing your experience and the discussion. It sounds like your setup might work as you described it above in https://github.com/apache/couchdb/issues/2382#issuecomment-569491393. Just make sure you keep the same node names, if possible, when/if you move your shards; it will make things easier. Also keep an eye on network latency and connectivity.
   
   Regarding security on the WAN, you could try a VPN, or use TLS (see https://www.erlang-solutions.com/blog/erlang-distribution-over-tls.html). TLS will usually work better with later versions of Erlang, if you go that route.
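   
   If you go the TLS route, the Erlang distribution flags typically end up in `vm.args`, roughly along these lines (a sketch only; the option file path is a placeholder and the full certificate setup is covered in the linked post):
   
   ```
   # vm.args (illustrative) -- Erlang distribution over TLS
   -proto_dist inet_tls
   -ssl_dist_optfile /etc/couchdb/erl_ssl_dist.conf
   ```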
   
   Since the discussion in #2386 answers your questions about placement and R/W optimization well, let's close this ticket.

----------------------------------------------------------------

[GitHub] [couchdb] nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569161119
 
 
   I have never tried adding zones after the setup has already been running for a while. Also, running a cluster over a WAN might need a bit of tweaking.
   
   Maybe someone else can chime in with more info about zones; I am not too familiar with them. (The docs at http://docs.couchdb.org/en/latest/cluster/databases.html should help here, though.)
   
   However, I'd like to offer an alternative: run regular replications from the primary cluster to the secondary. Set up the secondary as a separate cluster, then have a script that discovers the dbs on the primary and sets up regular CouchDB replications to this disaster-recovery cluster. You can tweak the replication worker count, batch size, and `[scheduler] max_jobs` to limit the resources used. You'd get explicit monitoring, plus restart and backoff in case of network errors or disconnects, etc.
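   
   A minimal sketch of such a discovery-and-replicate script, using the `_all_dbs` and `_replicator` endpoints (cluster URLs, credentials and the doc-id scheme are placeholders):
   
   ```
   import requests
   
   # Push continuous replications for every non-system db on the primary
   # cluster to the disaster-recovery cluster. Placeholder URLs/credentials.
   SOURCE = "https://admin:password@primary.example.com"
   TARGET = "https://admin:password@backup.example.com"
   
   def mirror_all_dbs():
       for db in requests.get(f"{SOURCE}/_all_dbs").json():
           if db.startswith("_"):   # skip system dbs (_users, _replicator, ...)
               continue
           doc = {
               "_id": f"mirror-{db}",
               "source": f"{SOURCE}/{db}",
               "target": f"{TARGET}/{db}",
               "create_target": True,   # create the db on the backup cluster
               "continuous": True,      # keep following changes until cancelled
           }
           resp = requests.put(f"{SOURCE}/_replicator/mirror-{db}", json=doc)
           if resp.status_code != 409:  # 409 = replication doc already exists
               resp.raise_for_status()
   
   if __name__ == "__main__":
       mirror_all_dbs()
   ```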

----------------------------------------------------------------

[GitHub] [couchdb] rnewson commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569463499
 
 
   Also, I'm curious about q=3. q is unrelated to your problem; it is the amount of "scale" the db has and correlates most strongly with document count. It is not the number of replicas of your data.

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569236344
 
 
   Thanks a lot for your input.
   This also crossed my mind, but I think it might be too complex, and I would like a more elegant, issue-free solution. There must be a way to do this without custom scripts.
   
   I have read the docs numerous times, but I could not find answers to the following questions, and I do not think I am the only one who would benefit from them:
   
   When should we use zones and when replicas?
   
   Can we add zones to an existing setup later, and if so, how?
   
   What is the best/optimal way to set up a remote "hot" site backup of an existing cluster?
   
   What are the optimal settings for a cluster deployed across different remote sites?
   
   I think adding these practical cases, with examples, to the documentation would be a great benefit to the community.
   

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-572965171
 
 
   I will try to summarize my findings so far; hopefully they will be useful to somebody else.
   
   The z parameter is no longer supported, but even if you use it CouchDB will not complain, so you will end up with *fake* zoning (see #2386). Thus the link https://web.archive.org/web/20160429122538/https://cloudant.com/blog/choosing-zone-configurations-for-bigcouch is no longer very relevant; do not try to apply it.
   
   Use placement instead, which seems smarter and more user-friendly.
   
   CouchDB does not do R/W optimization when you use placement (zoning), which I think should be improved for the sake of performance.
   
   It seems that the *placement* implementation does not take advantage of zoning at all, i.e. if you have two sites and the quorum can be satisfied by the nodes in site A alone, reads/writes will still be sent over the WAN to site B, which I think is a pure waste of resources.
   Example: 2 nodes in site A and 1 node in site B, with placement = <zone-name-1>:2,<zone-name-2>:1
   
   Am I right, or have I misunderstood something?
   
   Thanks

----------------------------------------------------------------

[GitHub] [couchdb] nickva closed issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nickva closed issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382
 
 
   

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569491393
 
 
   @nickva thanks for the really useful thoughts and links. It looks like I cannot escape spinning up zones. I currently have 3 nodes with n=3, i.e. each node has a full copy of the DB. I am trying to avoid moving shards; I am inexperienced and I feel I would break things, so I am thinking of creating 2 zones: the first with 2 nodes in the main datacenter, and the second with just one node. All this using the existing setup, without creating a brand new cluster, by just moving one of the nodes from the main datacenter to the second one.
   I am not sure if this will work, so please comment. In theory I think it should be valid, since with this approach the existing shard map does not need to be altered.
   @rnewson The upper-layer logic constantly creates and removes DBs on the fly, and I think that if I use replication instead of zones for data sync it will not work out of the box; I would have to create a custom script, as @nickva noted above. Also, the link he posted clearly mentions that using replication for inter-cluster sync has significant drawbacks: "While perfectly acceptable, this has a vastly reduced guarantee of data freshness, even using continuous replication, plus increased latency and decreased throughput due to the serialization of data into the JSON format required for replication."
   In conclusion, besides my experiment with setting up zoning on the 3 existing nodes with n=3 and without re-balancing, I would also like to ask you guys whether there is a secure way for data to be replicated over the WAN when using zones.
   Thanks a lot

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-572965171
 
 
   I will try to summarize my findings so far; hopefully they will be useful to somebody else.
   
   The z parameter is no longer supported, but even if you use it CouchDB will not complain, so you will end up with *fake* zoning (see #2386). Thus the link https://web.archive.org/web/20160429122538/https://cloudant.com/blog/choosing-zone-configurations-for-bigcouch is no longer very relevant; do not try to apply it.
   
   Use placement instead, which seems smarter and more user-friendly.
   
   CouchDB does not do R/W optimization when you use placement (zoning), which I think should be improved for the sake of performance.
   
   It seems that the *placement* implementation does not take advantage of site *zoning* at all, i.e. if you have two sites and the quorum can be satisfied by the nodes in site A alone, reads/writes will still be sent over the WAN to site B, which I think is a pure waste of resources.
   Example: 2 nodes in site A and 1 node in site B, with placement = <zone-name-1>:2,<zone-name-2>:1
   
   Am I right, or have I misunderstood something?
   
   Thanks

----------------------------------------------------------------

[GitHub] [couchdb] rnewson commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569463365
 
 
   As Nick has noted, Erlang expects low latency between nodes, and there is only minimal security for that traffic by default (it is not encrypted), so you should not typically span datacenters (though see below) or geographical regions within a CouchDB cluster. For inter-cluster data sync, use replication (over https).
   
   The "r" and "w" config parameters are ignored by the system these days and are calculated from "n" (as n/2+1). You can override with query parameters but it is not recommended. The properties you think will be guaranteed by this will not be (and, as per my first paragraph, the cluster topology you would be using is both badly performing and insecure).
   
   On zones: these exist to help map the shards (q*n) of a database evenly across failure domains. In practice this means nodes in adjacent datacenters with very good, low-latency connectivity (think AWS availability zones), for the purpose of surviving a datacenter failure.

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569236344
 
 
   Thanks a lot for your input.
   This also crossed my mind, but I think it might be too complex, and I would like a more elegant, issue-free solution. There must be a way to do this without custom scripts.
   
   I have read the docs numerous times, but I could not find answers to the following questions, and I do not think I am the only one who would benefit from them:
   
   When should we use zones and when replicas?
   
   Can we add zones to an existing setup later, and if so, how?
   
   What is the best/optimal way to set up a remote "hot" site backup of an existing cluster?
   
   What are the optimal settings for a cluster deployed across different remote sites?
   
   I think adding these practical cases, with examples, to the documentation would be a great benefit to the community.
   

----------------------------------------------------------------

[GitHub] [couchdb] nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569391582
 
 
   @nicknaychov 
   
   I'll try to answer some of the questions you had
   
   > When should we use zones and when replicas?
   
   A placement setting using zones will override replicas. Setting a default placement, or providing one in the db-creation HTTP API request, will use that setting instead of what the `n` value holds. So specifying `.../?placement=zone_a:2,zone_b:2&n=3` would create 4 copies instead of 3. One thing to watch out for here is that `zone_a` and `zone_b` must each have at least two nodes.
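   
   A hedged sketch of that db-creation request (URL, credentials and zone names are placeholders; the zone names must match the `zone` labels on the nodes):
   
   ```
   import requests
   
   # Create a database with an explicit placement rule via query parameters.
   COUCH = "http://admin:password@localhost:5984"
   resp = requests.put(f"{COUCH}/mydb",
                       params={"placement": "zone_a:2,zone_b:1", "q": 3})
   resp.raise_for_status()
   print(resp.json())   # {'ok': True} on success
   ```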
   
   > Can we add zones to an existing setup later, and if so, how?
   
   You'd decide what the zone configuration looks like, then edit the `_nodes` metadata db to add a zone label to each node, then edit `_dbs` and update the shard map. This would be the re-balancing part you mentioned. You could follow some of the instructions from the shard-moving doc section: https://docs.couchdb.org/en/stable/cluster/sharding.html#moving-a-shard
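   
   A minimal sketch of the zone-labelling part, assuming the CouchDB 2.x node-local interface on port 5986 (newer releases expose these system dbs differently); node names and zone labels are placeholders:
   
   ```
   import requests
   
   # Tag each cluster node with a zone label by updating its document in the
   # node-local "_nodes" db. Placeholder URL, credentials, nodes and zones.
   NODE_LOCAL = "http://admin:password@localhost:5986"
   ZONES = {
       "couchdb@node1.example.com": "main-dc",
       "couchdb@node2.example.com": "main-dc",
       "couchdb@node3.example.com": "backup-dc",
   }
   
   for node, zone in ZONES.items():
       doc = requests.get(f"{NODE_LOCAL}/_nodes/{node}").json()
       doc["zone"] = zone                      # add or overwrite the zone label
       requests.put(f"{NODE_LOCAL}/_nodes/{node}", json=doc).raise_for_status()
   ```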
   
   This would have to be done correctly and in the right order, or all of a sudden your data won't be "found" by incoming requests.
   
   It's much easier if the dbs are created with the new setting already...
   
   > What is the best/optimal way to set up a remote "hot" site backup of an existing cluster?
   
   In general I have mostly seen simple replications used there :-) but with zones you'd follow the outline above and then maybe this archived blog post, https://web.archive.org/web/20160429122538/https://cloudant.com/blog/choosing-zone-configurations-for-bigcouch, which I thought discusses the issue pretty well. The top recommendation there, I think, is to run 3 zones: 2 primary ones with 2 nodes each, and 1 backup node in its own zone.
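   
   As a rough illustration of that 2+2+1 layout, one possible default placement string (zone names are hypothetical and would need to match the `zone` labels on the nodes) could be:
   
   ```
   ; local.ini (illustrative zone names)
   [cluster]
   placement = primary-a:2,primary-b:2,backup:1
   ```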
   
   > What are the optimal settings for a cluster deployed across different remote sites?
   
   The danger there is running Erlang clustering over a WAN. By default it is tuned for LAN-type networks. You might need to do more research there, but I know of at least one parameter, `+zdbbl <KB>`, that changes the Erlang distribution (clustering) buffer size; a larger value might help (see http://erlang.org/doc/man/erl.html#+zdbbl). That change would go into the `vm.args` file.
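   
   For example, a hedged vm.args sketch (the value is illustrative; the Erlang default is 1024 KB):
   
   ```
   # vm.args (illustrative) -- raise the distribution buffer busy limit for WAN links
   +zdbbl 32768
   ```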
   
   

----------------------------------------------------------------

[GitHub] [couchdb] nickva edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-573295902
 
 
   @nicknaychov 
   
   Thank you for sharing your experience and the discussion. It sounds like your setup might work as you described it above in https://github.com/apache/couchdb/issues/2382#issuecomment-569491393. Just make sure you keep the same node names, if possible, when/if you move your shards; it will make things easier. Also keep an eye on network latency and connectivity.
   
   Regarding security on the WAN, you could try a VPN, or use TLS (see https://www.erlang-solutions.com/blog/erlang-distribution-over-tls.html). TLS will usually work better with later versions of Erlang, if you go that route.
   
   Since the discussion in #2386 answers your questions about placement and R/W optimization well, let's close this ticket.

----------------------------------------------------------------

[GitHub] [couchdb] nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?

Posted by GitBox <gi...@apache.org>.
nicknaychov edited a comment on issue #2382: CouchDB backup instance - zones vs replicas?
URL: https://github.com/apache/couchdb/issues/2382#issuecomment-569491393
 
 
   @nickva thanks for the really useful thoughts and links. It looks like I cannot escape spinning up zones. I currently have 3 nodes with n=3, i.e. each node has a full copy of the DB. I am trying to avoid moving shards; I am inexperienced and I feel I would break things, so I am thinking of creating 2 zones: the first with 2 nodes in the main datacenter, and the second with just one node. All this using the existing setup, without creating a brand new cluster or re-balancing the existing one, by just moving one of the nodes from the main datacenter to the second one.
   I am not sure if this will work, so please comment. In theory I think it should be valid, since with this approach the existing shard map does not need to be altered.
   @rnewson The upper-layer logic constantly creates and removes DBs on the fly, and I think that if I use replication instead of zones for data sync it will not work out of the box; I would have to create a custom script, as @nickva noted above. Also, the link he posted clearly mentions that using replication for inter-cluster sync has significant drawbacks: "While perfectly acceptable, this has a vastly reduced guarantee of data freshness, even using continuous replication, plus increased latency and decreased throughput due to the serialization of data into the JSON format required for replication."
   In conclusion, besides my experiment with setting up zoning on the 3 existing nodes with n=3 and without re-balancing, I would also like to ask you guys whether there is a secure way for data to be replicated over the WAN when using zones.
   Thanks a lot

----------------------------------------------------------------