You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hc.apache.org by GitBox <gi...@apache.org> on 2021/03/24 14:21:01 UTC

[GitHub] [httpcomponents-core] carterkozak opened a new pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

carterkozak opened a new pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274


   URIBuilder complies with https://tools.ietf.org/html/rfc2732


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] ok2c commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
ok2c commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602449044



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       > This also results in unintuitive and inconsistent behavior due to the fact that `URIBuilder` uses `URI#getRawAuthority` by default and then clears the authority if any part of it changes.
   
   @pkoenig10 I just want to point out that `java.net.URI` does not conform to RFC 3986. It conforms to RFC 2396 only. Behavioral inconsistencies between `java.net.URI` and `URIBuilder` are to be expected.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] pkoenig10 commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
pkoenig10 commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601688891



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This also results in unintuitive and inconsistent behavior due to the fact that  `URIBuilder` uses `URI#getHost` by default and then clears the host if any part of the authority changes.
   
   For example:
   ```
   URI uri = URI.create("http://[::1]/");
   
   // Produces "http://[::1]/"
   new URIBuilder(uri).toString();
   
   // Produces "http://%5B%3A%3A1%5D:1234/"
   new URIBuilder(uri).setPort(1234).toString();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] ok2c commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
ok2c commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r600588526



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/InetAddressUtils.java
##########
@@ -137,6 +137,18 @@ public static boolean isIPv6Address(final String input) {
         return isIPv6StdAddress(input) || isIPv6HexCompressedAddress(input);
     }
 
+    /**
+     * Checks whether the parameter is a valid rfc2732 url formatted bracketed IPv6 address (including compressed).

Review comment:
       @carterkozak I would rather avoid referencing RFCs in class names and in javadocs at all. RFCs can get superseded by a newer version and one would have to scan the source for references to outdated documents.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607111057



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Update has been pushed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602477244



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       @carterkozak Let me crunch on your comments...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607095304



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       I don't think so, the ipv6 address as a host value matches the ipv6 pattern which you've pointed out is less ambiguous than the bracketed value (when used in isolation, not within a URI), so there's no reason to match the java `URI.getHost` brackets.

##########
File path: httpcore5/src/test/java/org/apache/hc/core5/net/TestURIBuilder.java
##########
@@ -813,4 +813,32 @@ public void testNormalizeSyntax() throws Exception {
                 new URIBuilder("http:///../../.././").normalizeSyntax().build().toASCIIString());
     }
 
+    @Test
+    public void testIpv6Host() throws Exception {
+        final URI uri = new URIBuilder("https://[::1]:432/path").build();
+        Assert.assertEquals(432, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());
+        Assert.assertEquals("/path", uri.getPath());
+    }
+
+    @Test
+    public void testIpv6HostWithPortUpdate() throws Exception {
+        // Updating the port clears URIBuilder.encodedSchemeSpecificPart
+        // and bypasses the fast/simple path which preserves input.
+        final URI uri = new URIBuilder("https://[::1]:432/path").setPort(123).build();
+        Assert.assertEquals(123, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());

Review comment:
       There's nothing we can do in this case because we're operating on the constructed `URI` object, not the `URIBuilder`. `URIBuilder.getHost` does return `::1`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607103024



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Oh yes, that does make sense, I will update the change. Sorry for the misinterpretation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#issuecomment-806205520


   This should be ready. I will plan to merge tomorrow unless there is additional feedback.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] garydgregory commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
garydgregory commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r600575556



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/InetAddressUtils.java
##########
@@ -137,6 +137,18 @@ public static boolean isIPv6Address(final String input) {
         return isIPv6StdAddress(input) || isIPv6HexCompressedAddress(input);
     }
 
+    /**
+     * Checks whether the parameter is a valid rfc2732 url formatted bracketed IPv6 address (including compressed).

Review comment:
       URL is an acronym and should be in caps.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#issuecomment-813455876


   Thank you for bearing with me as well, I appreciate your feedback on the change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] ok2c commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
ok2c commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602860274



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       > Both of your questions are perfectly valid and basically boil down to the question: Shall a URI builder operate (accept) on raw (unencoded) input and hide away all encoding and with getters return raw values or require encoded value already?
   
   @carterkozak @michael-o Folks, whatever you decide to do, `URIBuilder` must operate on raw (unecoded) values internally and encode them upon the final URI production.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601541841



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This is an interesting point, and I may not have understood enough of https://tools.ietf.org/html/rfc3986 to get this right.
   
   Based on a strict interpretation of the RFC, couldn't `new URIBuilder().setHost("::1").build()` also result in hostname `%3A%3A1`? At some level we need to distinguish IP address inputs from hostnames, in much the same way certificates differentiate SAN values.
   
   I worry about the case in which users use a URIBuilder with data from a java URI object. When ipv6 is used, the `uri.getHost()` method will return a bracketed ip address. I don't think it will be obvious that `new URIBuilder().setScheme("https").setHost(uri.getHost()).build()` will produce a percent-encoded hostname rather than ipv6. I don't think we need to design the URIBuilder to reproduce the java URI's quirks, but anything we can do to help developers avoid unintentional failures will be worthwhile.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#issuecomment-813451723


   @carterkozak Thank you for bearing with me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602328865



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       > So colon matches to pct-encoded, but represents an illegal hostname.
   
   Can you expand upon why `%3A%3A1` represents an illegal reg-name and `%5B%3A%3A1%5D` does not?
   
   The rules defined in RFC 3986 assume either we're parsing a URI from a string value where it has been correctly encoded, or we have data that we have already classified as either in ipv4, ipv6, or host identifier. Unfortunately it doesn't give us much guidance when we have an arbitrary identifier that may or may not need to be encoded based on whether it's an ipv6 address, or a hostname, though if the RFC states some hostnames are illegal/reserved for URI syntax, this may not be an issue (apologies if I have missed something simple).
   
   We could store two values, one string representation of an ip address (v4 or v6) and a second to represent an unencoded hostname, which may be percent encoded when we render the URI.
   
   However this introduces a new problem[1], and fails to solve an existing problem[2]:
   1. Using ipv6, the return value of getHost changes from `[::1]` to `::1`. Consumers are responsible for determining whether the value is an unencoded hostname that must be percent-encoded, or an ipv6 address, just as we are attempting here.
   2. While the system should work properly parsing from a full URI, the `setHost` method is still ambiguous and should probably be split into `setHostIP` and `setHostName` to convey intent. Otherwise we're likely to attempt to resolve hostnames for inputs that the developer expected to be an IP address. I'd much rather throw an exception in this case than send data to the wrong host.
   
   Miscellaneous note: java 14 introduced changes to `InetSocketAddress.toString` which borrow IPv6 URI encoding outside of the URI space, so hosts with square braces parsed from the result of `InetSocketAddress.toString().split(":", 2)[0]` are more likely to be passed to URIBuilder.setHost.
   https://bugs.openjdk.java.net/browse/JDK-8225499
   https://hg.openjdk.java.net/jdk/jdk/rev/eb172a3b1c1c




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r606336900



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       @michael-o This is ready for another round of review. I'd appreciate your thoughts on the two comments above when you have the time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602081091



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       As for `[::1]` and `::1`. RFC 8200 nowhere mentions the brackets. They are used in URIs to avoid parsing errors. Same in `httpd.conf` and `sshd_config`, etc. We have to differentionate between host subcomponent and host actually. The subcomponent of course can contain brackets and percent-encoded content while the actual reg-name does not. The host production applies to host subcomponents under the restrictions of the URI production only.
   From an API point of view the client does not care about the internal encoding of the builder, but provides raw values, the builder needs to take care. So have the getter work, return Java strings and not escaped ones.
   
   This is my understanding of the host value. I am certain that this is disputed. Maybe one should inquire with the WG.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] pkoenig10 commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
pkoenig10 commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601688891



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This also results in unintuitive and inconsistent behavior due to the fact that  `URIBuilder` uses `URI#getHost` by default, and then clears the host if any part of the authority changes.
   
   For example:
   ```
   URI uri = URI.create("http://[::1]/");
   
   // Produces "http://[::1]/"
   new URIBuilder(uri).toString();
   
   // Produces "http://%5B%3A%3A1%5D:443/"
   new URIBuilder(uri).setPort(1234).toString();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] garydgregory commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
garydgregory commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r600575556



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/InetAddressUtils.java
##########
@@ -137,6 +137,18 @@ public static boolean isIPv6Address(final String input) {
         return isIPv6StdAddress(input) || isIPv6HexCompressedAddress(input);
     }
 
+    /**
+     * Checks whether the parameter is a valid rfc2732 url formatted bracketed IPv6 address (including compressed).

Review comment:
       URL is an acronym and should be in caps. Same for RFC.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] pkoenig10 commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
pkoenig10 commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601688891



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This also results in unintuitive and inconsistent behavior due to the fact that  `URIBuilder` uses `URI#getHost` by default and then clears the host if any part of the authority changes.
   
   For example:
   ```
   URI uri = URI.create("http://[::1]/");
   
   // Produces "http://[::1]/"
   new URIBuilder(uri).toString();
   
   // Produces "http://%5B%3A%3A1%5D:443/"
   new URIBuilder(uri).setPort(1234).toString();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r603406675



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       Also note the `Assert.assertEquals("[::1]", uri.getHost());` in my above example must check for a bracketed value due to the URI.getHost incorrect handling of ipv6 addresses -- the `builder.getHost()` would correctly return `::1`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601238087



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       I consider that this is logically wrong. The provided host is URI agnostic because the brackets are a URI problem not a host one. For me a host of `[::1]` as input is invalid.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607110871



##########
File path: httpcore5/src/test/java/org/apache/hc/core5/net/TestURIBuilder.java
##########
@@ -813,4 +813,32 @@ public void testNormalizeSyntax() throws Exception {
                 new URIBuilder("http:///../../.././").normalizeSyntax().build().toASCIIString());
     }
 
+    @Test
+    public void testIpv6Host() throws Exception {
+        final URI uri = new URIBuilder("https://[::1]:432/path").build();
+        Assert.assertEquals(432, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());
+        Assert.assertEquals("/path", uri.getPath());
+    }
+
+    @Test
+    public void testIpv6HostWithPortUpdate() throws Exception {
+        // Updating the port clears URIBuilder.encodedSchemeSpecificPart
+        // and bypasses the fast/simple path which preserves input.
+        final URI uri = new URIBuilder("https://[::1]:432/path").setPort(123).build();
+        Assert.assertEquals(123, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());

Review comment:
       I've also updated the tests to check URIBuilder getters in addition to URI getters -- this should make the differences between URIBuilder.getHost and URI.getHost more obvious.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602789026



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       > Can you expand upon why %3A%3A1 represents an illegal reg-name and %5B%3A%3A1%5D does not?
   
   The RFC says
   
         host        = IP-literal / IPv4address / reg-name
        The syntax rule for host is ambiguous because it does not completely
        distinguish between an IPv4address and a reg-name. 
   
   I consider both inputs you have provided to be semantically (not syntactically) illegal. The inputs you have provided do neither match to `IP-literal / IPv4address`. So, it must match `reg-name`, domain names have a fixed syntax. After decoding it is obvious that neither `::1` nor `[::1]` are valid domain names. They are syntactically valid `reg-names`. See [here](https://tools.ietf.org/html/rfc1034#section-3.5). The input is highly ambiguous.
   
   This is an interesting example:
   ```
   $ ifconfig | grep inet6
           inet6 fe80::yyy:9466%em0 prefixlen 64 scopeid 0x1
           inet6 2003:xxx:fed2:9466 prefixlen 64 autoconf
           inet6 ::1 prefixlen 128
           inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
   $ python3
   Python 3.7.10 (default, Mar 13 2021, 20:44:50)
   [Clang 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611a on freebsd12
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import socket
   >>> socket.gethostbyname('::1')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('127.0.0.1')
   '127.0.0.1'
   >>> socket.gethostbyname('[::1]')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('fe80::yyy:9466')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('[fe80::yyy:9466]')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('fe80::yyy:9466%em0')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('[fe80::yyy:9466%em0]')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('[2003:xxx:fed2:9466]')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('2003:xxx:fed2:9466')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('fe80::1%lo0')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   >>> socket.gethostbyname('[fe80::1%lo0]')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   socket.gaierror: [Errno 8] Name does not resolve
   ```
   
   routing table looks good:
   ```
   $ netstat -rn6
   Routing tables
   
   Internet6:
   Destination                       Gateway                       Flags     Netif Expire
   ::/96                             ::1                           UGRS        lo0
   default                           fe80::yyy:2772%em0            UG          em0
   ::1                               link#2                        UH          lo0
   ::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
   xxx::/64                          link#1                        U           em0
   xxx:219:99ff:fed2:9466            link#1                        UHS         lo0
   fe80::/10                         ::1                           UGRS        lo0
   fe80::%em0/64                     link#1                        U           em0
   fe80::219:99ff:fed2:9466%em0      link#1                        UHS         lo0
   fe80::%lo0/64                     link#2                        U           lo0
   fe80::1%lo0                       link#2                        UHS         lo0
   ff02::/16                         ::1                           UGRS        lo0
   ```
   
   But I guess we cannot really restrict non-valid domain names and we shouldn't. Garbage in and garbage out.
   
   > We could store two values, one string representation of an ip address (v4 or v6) and a second to represent an unencoded hostname, which may be percent encoded when we render the URI.
   
   Yes, internally we could (should) work we raw values, but RFC 3986 forces us to encode to help parsers. A technical detail. A `URIBuilder` is an abstraction from the actual syntax, after all.
   
   Both of your questions are perfectly valid and basically boil down to the question: Shall a URI builder operate (accept) on raw (unencoded) input and hide away all encoding and with getters return raw values or require encoded value already? If it requires pre-encoded values, I consider a builder pointless because it would do mere string concat. I want to supply raw values and have the builder to do the hard work for RFC 3986.
   
   Now let's get to the JDK changes: Oracle is, again, embarassing here citing RFC 2732. This one is dead and obsolete. The basic problem they needed to solve is the ambiguity with socket addresses: inet address with ports because IPv6 literals use colons  just like host/port separators. The reason why they "abused" RFC 3986 also there is no URI here is because they did not want to make up their own syntax and reuse an already established one. Note that many other software uses the bracket notation to clearly separate IPv6 from the port. Otherwise how to interprete this inet socket address: `::1`. Is it `::` (any) on port 1 or just `::1` (localhost) with any port information?
   
   JDK-8225499 is a good proof of Oracle's inability to solve problems in time. They should have avoided this with the inception of `Inet6Address` and not 10 years later. Pure BSD/POSIX sockets don't suffer from this issue:
   
   ```
   $ python3
   Python 3.7.10 (default, Mar 13 2021, 20:44:50)
   [Clang 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611a on freebsd12
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import socket
   >>> sock = socket.socket(socket.AF_INET6)
   >>> sock
   <socket.socket fd=3, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=0, laddr=('::', 0, 0, 0)>
   >>> sock.bind(('::1', 20000))
   >>> sock
   <socket.socket fd=3, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=0, laddr=('::1', 20000, 0, 0)>
   >>> sock.listen(1)
   >>> conn, addr = sock.accept() # running `socat(1)`  in another terminal
   >>> addr
   ('::1', 44087, 0, 0)
   ```
   
   ```
   $ echo "HELLO" | socat - 'TCP:[::1]:20000'
   mosipov@bsd1srv:/usr/home/mosipov
   $ echo "HELLO" | socat - 'TCP:::1:20000'
   2021/03/27 21:31:41 socat[61890] E TCP: wrong number of parameters (4 instead of 2)
   ```
   
   Brackets are just a technical necessity.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#issuecomment-806481351


   Going through..


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601541841



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This is an interesting point, and I may not have understood enough of https://tools.ietf.org/html/rfc3986 to get this right.
   
   Based on a strict interpretation of the RFC, couldn't `new URIBuilder().setHost("::1").build()` also result in hostname `%3A%3A1`? At some level we need to distinguish IP address inputs from hostnames, in much the same way certificates differentiate SAN values between IP addresses and DNS.
   
   I worry about the case in which users use a URIBuilder with data from a java URI object. When ipv6 is used, the `uri.getHost()` method will return a bracketed ip address. I don't think it will be obvious that `new URIBuilder().setScheme("https").setHost(uri.getHost()).build()` will produce a percent-encoded hostname rather than ipv6. I don't think we need to design the URIBuilder to reproduce the java URI's quirks, but anything we can do to help developers avoid unintentional failures will be worthwhile.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r600688026



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/InetAddressUtils.java
##########
@@ -137,6 +137,18 @@ public static boolean isIPv6Address(final String input) {
         return isIPv6StdAddress(input) || isIPv6HexCompressedAddress(input);
     }
 
+    /**
+     * Checks whether the parameter is a valid rfc2732 url formatted bracketed IPv6 address (including compressed).

Review comment:
       That's fair, I've removed references to the rfc with the exception of a line-comment in the test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r603389603



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       What do we expect in this case?
   
   ```java
       @Test
       public void testBuilderWithBracketedIpv6Host() throws Exception {
           URI firstUri = URI.create("https://[2001:db8::1428:57ab]:123/first");
           URI secondUri = URI.create("https://[::1]:456/second");
           final URI uri = new URIBuilder(firstUri).setHost(secondUri.getHost()).build();
           Assert.assertEquals("https", uri.getScheme());
           Assert.assertEquals(123, uri.getPort()); // Current implementation results in -1
           Assert.assertEquals("[::1]", uri.getHost()); // Current implementation results in null
           Assert.assertEquals("/first", uri.getPath());
       }
   ```
   
   Currently it's "garbage in, garbage out" based on the `secondUri` which is a valid URI unfortunately returning `[::1]` from `secondUri.getHost()`. While we certainly shouldn't require inputs to be encoded already, I think we may wish to either allow inputs to `setHost(String)` to be bracketed and strip the brackets, or throw an exception to avoid passing unexpected garbage to the networking layer. My preference is to normalize `[::1]` to `::1` given that URIBuilder interacts directly with the incorrect `URI` type.
   
   I've push an implementation without any special handling of `setHost(bracketedIpv6String)`, what do you think is best in this case?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601655261



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       Let me re-read the RFC and get back to you...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602911983



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       Yep, I think we’re all in agreement. Thanks for the details @michael-o, I’ll put together an updated revision tomorrow.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] pkoenig10 commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
pkoenig10 commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r601688891



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       This also results in unintuitive and inconsistent behavior due to the fact that  `URIBuilder` uses `URI#getRawAuthority` by default and then clears the authority if any part of it changes.
   
   For example:
   ```
   URI uri = URI.create("http://[::1]/");
   
   // Produces "http://[::1]/"
   new URIBuilder(uri).toString();
   
   // Produces "http://%5B%3A%3A1%5D:1234/"
   new URIBuilder(uri).setPort(1234).toString();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r600578053



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/InetAddressUtils.java
##########
@@ -137,6 +137,18 @@ public static boolean isIPv6Address(final String input) {
         return isIPv6StdAddress(input) || isIPv6HexCompressedAddress(input);
     }
 
+    /**
+     * Checks whether the parameter is a valid rfc2732 url formatted bracketed IPv6 address (including compressed).

Review comment:
       Good call, updated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602081554



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       @pkoening10 based on my understanding the last output is wrong.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607101503



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Sorry, I guess my comment wasn't clear enough: Should the method's Javadoc explicitly say that it does not correspond to the behavior of `URI#getHost()` as it it will return raw values.
   
   Does this compute for you?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak merged pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak merged pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607035068



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Very nice!

##########
File path: httpcore5/src/test/java/org/apache/hc/core5/net/TestURIBuilder.java
##########
@@ -813,4 +813,32 @@ public void testNormalizeSyntax() throws Exception {
                 new URIBuilder("http:///../../.././").normalizeSyntax().build().toASCIIString());
     }
 
+    @Test
+    public void testIpv6Host() throws Exception {
+        final URI uri = new URIBuilder("https://[::1]:432/path").build();
+        Assert.assertEquals(432, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());
+        Assert.assertEquals("/path", uri.getPath());
+    }
+
+    @Test
+    public void testIpv6HostWithPortUpdate() throws Exception {
+        // Updating the port clears URIBuilder.encodedSchemeSpecificPart
+        // and bypasses the fast/simple path which preserves input.
+        final URI uri = new URIBuilder("https://[::1]:432/path").setPort(123).build();
+        Assert.assertEquals(123, uri.getPort());
+        Assert.assertEquals("https", uri.getScheme());
+        Assert.assertEquals("[::1]", uri.getHost());

Review comment:
       See my comment above.

##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Question: Should `#getHost()` contain information that is unwraps IPv6 addresses?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602077643



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       First of all `%3A%3A1` is invalid. See the production for the host components:
   > host        = IP-literal / IPv4address / reg-name
   Neither `IP-literal`, nor `IPv4address` allow PCT encoded so these productions are skipped. Remains `reg-name`:
   > reg-name    = *( unreserved / pct-encoded / sub-delims )
   
   So colon matches to `pct-encoded`, but represents an illegal hostname.
   Also note:
   > This syntax does not support IPv6 scoped addressing zone identifiers.
   
   Also this one:
   
   >    The syntax rule for host is ambiguous because it does not completely
      distinguish between an IPv4address and a reg-name.  In order to
      disambiguate the syntax, we apply the "first-match-wins" algorithm:
      If host matches the rule for IPv4address, then it should be
      considered an IPv4 address literal and not a reg-name.
   
   Thanks my understanding of the RFC.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] carterkozak commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
carterkozak commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602333014



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       I also want to thank you for working through this with me, @michael-o. I hope I'm not coming across as pedantic, my intent is to better understand the problem (you're much more familiar with this RFC than I), and come to a safe, maintainable solution.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r602869674



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       > 
   > 
   > > Both of your questions are perfectly valid and basically boil down to the question: Shall a URI builder operate (accept) on raw (unencoded) input and hide away all encoding and with getters return raw values or require encoded value already?
   > 
   > @carterkozak @michael-o Folks, whatever you decide to do, `URIBuilder` must operate on raw (unecoded) values internally and encode them upon the final URI production.
   
   @ok2c I absolutely do *not* intend to change that because this is the right approach to do here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r607101503



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -333,7 +333,12 @@ private void digestURI(final URI uri, final Charset charset) {
         this.scheme = uri.getScheme();
         this.encodedSchemeSpecificPart = uri.getRawSchemeSpecificPart();
         this.encodedAuthority = uri.getRawAuthority();
-        this.host = uri.getHost();
+        final String uriHost = uri.getHost();
+        // URI.getHost incorrectly returns bracketed (encoded) IPv6 values. Brackets are an
+        // encoding detail of the URI and not part of the host string.
+        this.host = uriHost != null && InetAddressUtils.isIPv6URLBracketedAddress(uriHost)

Review comment:
       Sorry, I guess my comment wasn't clear enough: Should the method's Javadoc explicitly say that it does not correspond to the behavior of `URI#getHost()` as it it will return raw values?
   
   Does this compute for you?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[GitHub] [httpcomponents-core] michael-o commented on a change in pull request #274: HTTPCORE-671: URIBuilder does not precent-encode bracketed IPv6 addre…

Posted by GitBox <gi...@apache.org>.
michael-o commented on a change in pull request #274:
URL: https://github.com/apache/httpcomponents-core/pull/274#discussion_r606355027



##########
File path: httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java
##########
@@ -292,6 +292,8 @@ private String buildString() {
                 }
                 if (InetAddressUtils.isIPv6Address(this.host)) {
                     sb.append("[").append(this.host).append("]");
+                } else if (InetAddressUtils.isIPv6URLBracketedAddress(this.host)) {

Review comment:
       Thank you, will pick up.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org