You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ant.apache.org by GitBox <gi...@apache.org> on 2022/01/24 15:26:16 UTC
[GitHub] [ant] tsmock commented on a change in pull request #175: junitlauncher - Fix an issue in LegacyXmlResultFormatter with ]]> in stacktraces

tsmock commented on a change in pull request #175:
URL: https://github.com/apache/ant/pull/175#discussion_r790853748



##########
File path: src/main/org/apache/tools/ant/taskdefs/optional/junitlauncher/LegacyXmlResultFormatter.java
##########
@@ -298,8 +298,9 @@ private void writeFailed(final XMLStreamWriter writer, final TestIdentifier test
                     writeAttribute(writer, ATTR_MESSAGE, message);
                 }
                 writeAttribute(writer, ATTR_TYPE, t.getClass().getName());
-                // write out the stacktrace
-                writer.writeCData(StringUtils.getStackTrace(t));
+                // write out the stacktrace, replacing any CDATA endings by splitting them into multiple CDATA segments
+                // See https://bz.apache.org/bugzilla/show_bug.cgi?id=65833
+                writer.writeCData(StringUtils.getStackTrace(t).replaceAll("]]>", "]]]]><![CDATA[>"));

Review comment:
       The `encode` method currently only handles special characters (generic XML). This handles a special _sequence_ for a specific data section. `]]>`.
   
   From [1], "Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using " &lt; " and " &amp; ". CDATA sections cannot nest." (CDEnd === `]]>`)
   
   This means that replacing `>` with `&gt;` would not work, as CDATA parsers shouldn't treat `&gt; === >`. They should treat it as a _literal_ `&gt;` (since the & is literal), which can break many usages of the stacktrace. Sure, people could have regexes to fix the stack trace, but that is duplicate work for many people.
   
   With all that said, `writeCData` should properly handle text with `]]>` in it, but since people have had to work around it, it may no longer be possible to fix the issue in the actual implementing libraries without a lot of work. How would it know if you were working around this issue, and _not_ escape the cdata sequence again? (`]]>` -> `]]]]><!CDATA[>` -> `]]]]]]><!CDATA[><!CDATA[>`).
   Example pitfall: A test failure outputs XML with a CDATA section, which is then written to CDATA in LegacyXmlResultFormatter. We want to keep the XML output as is, which means escaping the interior CDATA endings. If this then happens again in the `writeCData` method, then the xml output shown to the user will most likely be wrong. From the above example, instead of `]]>`, the user would see `]]]]><!CDATA[>` unless the CDATA parser recursively parses the data (unlikely IMO).
   
   [1] https://www.w3.org/TR/REC-xml/#dt-cdsection




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org