You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cxf.apache.org by GitBox <gi...@apache.org> on 2022/11/15 18:15:45 UTC

[GitHub] [cxf] neseleznev opened a new pull request, #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

neseleznev opened a new pull request, #1028:
URL: https://github.com/apache/cxf/pull/1028

   Related to https://github.com/apache/cxf/pull/950 and https://github.com/apache/cxf/pull/993
   
   ## Problem
   The cxf version `3.5.3` introduced bug which results in invalid MTOM requests.
   More precisely, the version highlighted an older bug, I'll elaborate on it :)
   
   ## Investigation
   
   Commit https://github.com/apache/cxf/commit/ffba34eed2d5b4af22a93c100e4687e234d53b28#diff-e3efb80d0a98bbbd7f6eddd3c021c5fb5ab05ea2ee8d97dc68026f6345e5a509 by @reta had changed how `Content-Id` is being dumped to headers.
   First of all, thank you for the bold point of doing this, referring to the RFCs. 
   Let's have a look at the line 243 in particular
   
   Provided that `attachmentId` is of format `uuid@domain` it works as exepected, however, `attachmentId` is being generated by CXF in routine https://github.com/apache/cxf/blob/2ad9d0b2eef17c0d57d3cb96f3b2cecd1e704869/core/src/main/java/org/apache/cxf/attachment/AttachmentUtil.java#L230 which results in `uuid@urn:xml:namespace` on some inputs.
   This input leads to the Header being URL encoded.
   
   Issues with this header are known for a while https://issues.apache.org/jira/browse/CXF-2669 
   What's important is how do the SOAP servers treat URL-encoded `Content-Id`.
   In my experience, IRS.gov does not match 
   ```
   Content-ID: <3315f978-0190-4bc2-8a97-f766a78a7946-1@urn%3Aus%3Agov%3Atreasury%3Airs%3Acommon>
   ```
   with previously defined reference
   ```
   <xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="cid:3315f978-0190-4bc2-8a97-f766a78a7946-1@urn%3Aus%3Agov%3Atreasury%3Airs%3Acommon"/>
   ```
   which is basically the same and _should_ match.
   
   That said, it's well-known issue in the wild
   1. https://access.redhat.com/solutions/2062163
   2. https://access.redhat.com/solutions/4076871
   
   The latter points to the fact that there should be no URL-encoded symbols in `Content-Id`, which is met by @reta's commit.
   
   ## The Fix
   
   The problem is in `AttachmentUtil::createContentID`, so I've fixed the `Content-Id` generation to be more strict and use safe fallback value in cases of unmet domain pattern.
   The buggy method uses `new URI(...).getHost()` to extract domain, which is not the domain we expect to put in Content ID. 
   Namely, `URI::getHost` javadoc indicates:
   ```
   An IPv6 address enclosed in square brackets ('[' and ']') and consisting of hexadecimal digits, colon characters (':'), and possibly an embedded IPv4 address. The full syntax of IPv6 addresses is specified in RFC 2373: IPv6 Addressing Architecture.
   ```
   
   Thus, I've also added few tests, which include IPv6 (just in case :) ), and looking at those you may ensure the implications of the fix
   
   ## Prolog
   
   I'm new to Apache, and I'm not sure that this is the proper way to post the bug report.
   I wasn't able to log in to Apache's JIRA, so I decided to go with fix to speed things up.
   You may force-push / rebranch / rewrite / throw my commits away, but I'd be happy so long as the fix would be accepted
   
   As of now, I've rolled back cxf version to `3.5.3` and it works as expected. Provided that the upgrade has followed Java 17 migration, the same could happen to all the projects willing to use Java 17 along with CXF & MTOM functionality


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id [cxf]

Posted by "neseleznev (via GitHub)" <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1795737308

   > could you please confirm it solves the regression?
   
   Yes, thank you, this solved the issue.
   Better late than never :)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] neseleznev commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1318210158

   > Also could you please try 3.5.4, we fix one of the  `cid` regressions
   
   Yes, 3.5.4 has this defect as well


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328150629

   > > Also could you please try 3.5.4, we fix one of the  `cid` regressions
   > 
   > Yes, 3.5.4 has this defect as well
   
   Apologies for delay @neseleznev , the best option I came up with looking into the regression is to make `AttachmentSerializer` account for the fact that domain part could be already encoded (by trying to decode it):
   
   ```
                   String[] address = attachmentId.split("@", 2);
                   if (address.length == 2) {
                       // See please AttachmentUtil::createContentID, the domain part is URL encoded, try to decode it
                       final String decoded = tryDecode(address[1], StandardCharsets.UTF_8);
                       if (!decoded.equalsIgnoreCase(address[1])) {
                           writer.write(address[0] + "@" + decoded);
                       } else {
                           writer.write(attachmentId);
                       }
                   } else {
                       writer.write(URLEncoder.encode(attachmentId, StandardCharsets.UTF_8.name()));
                   }
   ``` 
   
   I would really prefer not to touch the createContentID (high risk to break things). I think it solves your problem, what do you think? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328330312

   > As I told you, you may do anything that is needed with my contribution — would it be a force push, or close the PR, I'd be happy as long as we fix it
   
   Pushed the change, could you please confirm it solves the regression? (I've added more test cases, seem to be fixing it), thank you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] neseleznev commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328308378

   @reta 
   Personally, I'd better fix the reason rather than symptoms. As of now, `createContentID` creates ContentID (🙂), which is malformed by RFC definition.
   Sadly, this method is `public static`, thus should be considered as public API the library. As a consequence, it could be used in third-party libraries, which rely on the current behavior, so you are right, it'd be safer to leave it as it.
   My conclusions could be exaggerated, so you guys know better how `cxf` is being used in the wild, and decision is up to you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328318747

   > Sure thing @reta. I'm glad you found out more stable solution. I keep an eye on new version with the fix deployed, thank you
   
   I have opened a regression ticket https://issues.apache.org/jira/browse/CXF-8799 to track the problem, do you prefer me to open another pull request or suggest push the suggested changes to this pull request (if you have no objections)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] neseleznev commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1315721025

   It could be that [apache:3.6.x-fixes](https://github.com/apache/cxf/tree/3.6.x-fixes) is a proper base for this change, so if you want me to rebase the branch, don't hesitate to ask


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] dkulp merged pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
dkulp merged PR #1028:
URL: https://github.com/apache/cxf/pull/1028


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] neseleznev commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328326469

   > > Sure thing @reta. I'm glad you found out more stable solution. I keep an eye on new version with the fix deployed, thank you
   > 
   > I have opened a regression ticket https://issues.apache.org/jira/browse/CXF-8799 to track the problem, do you prefer me to open another pull request or push the suggested changes to this pull request (if you have no objections)?
   
   As I told you, you may do anything that is needed with my contribution — would it be a force push, or close the PR, I'd be happy as long as we fix it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1317906032

   Also could you please try 3.5.4, we fix one of the  `cid` regressions


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1317891379

   @neseleznev thanks a lot for very detailed and informative issue description (it seems like we've opened the Pandora box by fixing the `cid` encoding).  I think I understand the problem, mind if I take some time to think about it, your solution is definitely one way of solving thing but I am afraid we may have edge cases here. 
   
   Also, targeting `main` is fine, we backport all relevant fixed to maintainable release branches.
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] reta commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
reta commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328317923

   > Sadly, this method is `public static`, thus should be considered as public API of the library. As a consequence, it could be used in third-party libraries, which rely on the current behavior, so you are right, it'd be safer to leave it as it.
   
   This is exactly the concerns I run into, even such a small change as `CXF-8698` (`cid` by spec) resulted in number of unexpected regressions, thank you for understanding.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [cxf] neseleznev commented on pull request #1028: Fix/CXF-8698.2 Unxepected URLEncode in MTOM Content-Id

Posted by GitBox <gi...@apache.org>.
neseleznev commented on PR #1028:
URL: https://github.com/apache/cxf/pull/1028#issuecomment-1328318339

   Sure thing @reta. I'm glad you found out more stable solution. I keep an eye on new version with the fix deployed, thank you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@cxf.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org