Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2019/10/24 01:03:33 UTC

[GitHub] [couchdb] kocolosk opened a new issue #2271: Connection sharing is corrupted when replicating using a proxy

kocolosk opened a new issue #2271: Connection sharing is corrupted when replicating using a proxy
URL: https://github.com/apache/couchdb/issues/2271
 
 
   ## Description
   
   In apache/couchdb@2505436 we introduced an optimization to share connections to a given host:port across different replications. I believe this breaks replications that use a forward proxy: every connection that goes through the proxy is placed in the same pool, so requests end up being sent to the wrong hosts.
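   
   To make the collision concrete, here is a minimal sketch that mirrors the URL selection shown in the patch under "Additional context". The module name, the function names, and the example URLs are hypothetical and only illustrate the behaviour; this is not the actual replicator code.
   
   ```erlang
   -module(pool_key_sketch).
   -export([pool_url/2, demo/0]).

   %% Mirrors the selection quoted in this issue: when a proxy is configured,
   %% the proxy URL (not the endpoint URL) is what identifies the pool.
   pool_url(Url, undefined) -> Url;
   pool_url(_Url, ProxyUrl) when is_list(ProxyUrl) -> ProxyUrl.

   %% Hypothetical endpoints: two different hosts behind the same proxy.
   demo() ->
       Proxy = "http://127.0.0.1:3128",
       Source = "https://example.cloudant.com/db",
       Target = "http://127.0.0.1:15984/db",
       %% Both endpoints map to the same pool identity, so their
       %% connections can be handed out interchangeably.
       pool_url(Source, Proxy) =:= pool_url(Target, Proxy).
   ```
   
   With a proxy set, `demo/0` evaluates to `true`: the source and the target are indistinguishable to anything keyed on that URL.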
   
   ## Steps to Reproduce
   
   1. Set up a simple forward proxy. Squid via Homebrew worked for me.
   1. Configure a replication via the proxy where the source and target use different hosts. I used a database hosted in Cloudant as the source and a database on my dev box at 127.0.0.1:15984 as the target.
   1. Watch as the replicator sends requests to the wrong locations. In my case the replication crashed, complaining about a 401 Unauthorized from my (admin party) dev setup, but when I turned on ibrowse tracing I saw that the response had actually come from Cloudant.
   
   ## Expected Behaviour
   
   Replication through a forward proxy should replicate data, with each request reaching the host it was addressed to.
   
   ## Your Environment
   
   CouchDB 3.0.0-201d5935c on macOS Catalina
   
   ## Additional context
   
   #1080 also talked about replication failing behind a proxy years ago, but I think this is a different issue. I do agree with the suggestion in that issue that configuring proxies separately for the source and the target makes a lot of sense.
   
   As a test I tried the following patch, which makes the shared connection pool look at the final host:port instead of the proxy host:port:
   
   ```diff
   diff --git a/src/couch_replicator/src/couch_replicator_httpc.erl b/src/couch_replicator/src/couch_replicator_httpc.erl
   index e4cf11606..576285983 100644
   --- a/src/couch_replicator/src/couch_replicator_httpc.erl
   +++ b/src/couch_replicator/src/couch_replicator_httpc.erl
   @@ -47,10 +47,11 @@ setup(Db) ->
            http_connections = MaxConns,
            proxy_url = ProxyURL
        } = Db,
   -    HttpcURL = case ProxyURL of
   -        undefined -> Url;
   -        _ when is_list(ProxyURL) -> ProxyURL
   -    end,
   +    % HttpcURL = case ProxyURL of
   +    %     undefined -> Url;
   +    %     _ when is_list(ProxyURL) -> ProxyURL
   +    % end,
   +    HttpcURL = Url,
        {ok, Pid} = couch_replicator_httpc_pool:start_link(HttpcURL,
            [{max_connections, MaxConns}]),
        case couch_replicator_auth:initialize(Db#httpdb{httpc_pool = Pid}) of
   ```
   
   This did allow me to get the replication working, but it's not a complete fix: if a server had several replications with the same host:port as the final endpoint, and some of them used a proxy while others did not, this code would still mix their connections together.
   
   A more complete fix would be to expand the key used in the `couch_replicator_connection` module to include more attributes than just a single URL.
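   
   As a rough illustration of what such an expanded key might look like, the sketch below builds a key from the endpoint host:port plus the proxy host:port. Everything here is an assumption for illustration: the module, the `conn_key/2` function, the key shape, and the example URLs are hypothetical, and it uses OTP's `uri_string:parse/1` as a stand-in for whatever URL parsing the replicator actually does.
   
   ```erlang
   -module(conn_key_sketch).
   -export([conn_key/2, demo/0]).

   %% Hypothetical composite key: endpoint host and port, plus the proxy
   %% host and port (or 'undefined' when no proxy is configured).
   conn_key(Url, ProxyUrl) ->
       {Host, Port} = host_port(Url),
       Proxy = case ProxyUrl of
           undefined -> undefined;
           _ when is_list(ProxyUrl) -> host_port(ProxyUrl)
       end,
       {Host, Port, Proxy}.

   %% Extract host and port, falling back to the scheme's default port.
   host_port(Url) ->
       #{host := Host} = Parsed = uri_string:parse(Url),
       Port = case Parsed of
           #{port := P} -> P;
           #{scheme := "https"} -> 443;
           #{scheme := "http"} -> 80
       end,
       {Host, Port}.

   demo() ->
       Proxy = "http://127.0.0.1:3128",
       K1 = conn_key("https://example.cloudant.com/db", Proxy),
       K2 = conn_key("http://127.0.0.1:15984/db", Proxy),
       K3 = conn_key("http://127.0.0.1:15984/db", undefined),
       %% Different endpoints behind one proxy get different keys, and a
       %% proxied endpoint is kept apart from the same endpoint unproxied.
       {K1 =/= K2, K2 =/= K3}.
   ```
   
   A key of this shape would address both cases described above: different endpoints behind one proxy, and the same endpoint reached with and without a proxy. Whether other attributes also belong in the key is left open here.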
