Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2022/08/04 19:55:31 UTC

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #819: Support XML strings in XMLTextInfosetInputter/Outputter

stevedlawrence commented on code in PR #819:
URL: https://github.com/apache/daffodil/pull/819#discussion_r938196120


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/XMLTextInfosetInputter.scala:
##########
@@ -17,24 +17,27 @@
 
 package org.apache.daffodil.infoset
 
+import java.io.StringWriter
+import java.nio.charset.StandardCharsets
+import javax.xml.XMLConstants
+import javax.xml.stream.XMLInputFactory
+import javax.xml.stream.XMLStreamConstants._
+import javax.xml.stream.XMLStreamException
+import javax.xml.stream.XMLStreamReader
+import javax.xml.stream.XMLStreamWriter
+import javax.xml.stream.util.XMLEventAllocator
+
+import org.apache.daffodil.dpath.NodeInfo
 import org.apache.daffodil.exceptions.Assert
+import org.apache.daffodil.infoset.InfosetInputterEventType._
+import org.apache.daffodil.util.MaybeBoolean
 import org.apache.daffodil.util.Misc
 import org.apache.daffodil.xml.XMLUtils
-import org.apache.daffodil.util.MaybeBoolean
-import org.apache.daffodil.dpath.NodeInfo
-import org.apache.daffodil.infoset.InfosetInputterEventType._
-
-import javax.xml.stream.XMLStreamReader
-import javax.xml.stream.XMLStreamConstants._
-import javax.xml.stream.XMLInputFactory
-import javax.xml.stream.util.XMLEventAllocator
-import javax.xml.stream.XMLStreamException
-import javax.xml.XMLConstants
 
-object XMLTextInfosetInputter {
+object XMLTextInfoset {
   lazy val xmlInputFactory = {
     val fact = new com.ctc.wstx.stax.WstxInputFactory()
-    fact.setProperty(XMLInputFactory.IS_COALESCING, true)
+    fact.setProperty(XMLInputFactory.IS_COALESCING, false)

Review Comment:
   When I started the implementation, I found that coalescing=true caused things like CDATA, whitespace, etc. to get lost, so this was an early attempt to keep the unparsed XML as close as possible to the original. I've since determined that it's very difficult to make the unparsed XML identical to the original XML for many reasons (this being just one of them), so this change could be reverted without a problem.
   
   For normal infoset processing, though, the property doesn't actually make a difference, since we use `getText()`, which coalesces the simple content regardless of this property.
   
   So in most cases this property doesn't matter, and when it is set to false we get a slightly more accurate XML to String conversion on unparse.
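
The effect described above can be seen with a small, standalone sketch. It is not Daffodil code: it uses whatever StAX implementation `XMLInputFactory.newInstance()` returns rather than the Woodstox factory in the diff, and the object `CoalescingSketch` and its helpers are hypothetical names introduced for illustration. It also assumes the "read the whole simple content" call behaves like StAX's `getElementText()`, which concatenates all text up to the end tag.

```scala
import java.io.StringReader
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants._
import javax.xml.stream.XMLStreamReader
import scala.collection.mutable.ListBuffer

// Hypothetical illustration, not part of Daffodil.
object CoalescingSketch {

  // Build a reader from the default StAX factory (assumption: any
  // spec-compliant implementation, not necessarily Woodstox).
  private def reader(xml: String, coalescing: Boolean): XMLStreamReader = {
    val fact = XMLInputFactory.newInstance()
    fact.setProperty(XMLInputFactory.IS_COALESCING, java.lang.Boolean.valueOf(coalescing))
    fact.createXMLStreamReader(new StringReader(xml))
  }

  // Collect the text of every individual text-like event.
  def textEvents(xml: String, coalescing: Boolean): List[String] = {
    val xsr = reader(xml, coalescing)
    val chunks = ListBuffer[String]()
    while (xsr.hasNext) {
      xsr.next() match {
        case CHARACTERS | CDATA | SPACE => chunks += xsr.getText
        case _ => // ignore non-text events
      }
    }
    xsr.close()
    chunks.toList
  }

  // Read the full simple content of the root element in one call.
  def elementText(xml: String, coalescing: Boolean): String = {
    val xsr = reader(xml, coalescing)
    while (xsr.next() != START_ELEMENT) {} // advance to the root start tag
    val txt = xsr.getElementText
    xsr.close()
    txt
  }

  def main(args: Array[String]): Unit = {
    val xml = "<a>foo<![CDATA[bar]]>baz</a>"
    // How the text is split into events depends on the property and the parser.
    println(textEvents(xml, coalescing = true))
    println(textEvents(xml, coalescing = false))
    // The element text is the same either way.
    println(elementText(xml, coalescing = true))
    println(elementText(xml, coalescing = false))
  }
}
```

With coalescing disabled, a parser may report the CDATA section as a separate event, which is what lets an XML-to-String conversion preserve it. Reading the element's text as a whole gives the same string under either setting.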


