You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by "mike-mcgann (via GitHub)" <gi...@apache.org> on 2023/03/08 15:00:44 UTC

[GitHub] [daffodil] mike-mcgann opened a new pull request, #981: Update parsing when emptyValueDelimiterPolicy is 'none'

mike-mcgann opened a new pull request, #981:
URL: https://github.com/apache/daffodil/pull/981

   When a value is empty, has an initiator and terminator defined, and the emptyValueDelimiterPolicy is set to 'none', the initiator and terminator should not be found in the input stream and should not be emitted when unparsing. Daffodil was raising a parse error in this case. This update allows parsing to continue when the initiator or terminator are not required to appear.
   
   [DAFFODIL-2205](https://issues.apache.org/jira/browse/DAFFODIL-2205)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.
stevedlawrence commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129835842


##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/primitives/PrimitivesDelimiters.scala:
##########
@@ -68,15 +69,33 @@ abstract class DelimiterText(
     case _ => false
   }
 
+  // When the value is known to be empty, is this delimiter required to
+  // be present according to the emptyValueDelimiterPolicy
+  private val requiredOnEmptyValue = e match {
+    case elemB: ElementBase => elemB.isDelimiterRequiredWhenEmpty(delimiterType)
+    case _ => true
+  }

Review Comment:
   With the above suggestion, this just becomes a parameter passed into the `DelimiterText` constructor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129979595


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/DelimiterParsers.scala:
##########
@@ -108,12 +109,14 @@ class DelimiterTextParser(
     if (foundDelimiter.isDefined) {
       if (!containsLocalMatch(foundDelimiter.get.matchedDFAs, start)) {
         // It was a remote delimiter but we should have found a local one.
-        PE(
-          start,
-          "Found out of scope delimiter: '%s' '%s'",
-          foundDelimiter.get.matchedDFAs(0).lookingFor,
-          Misc.remapStringToVisibleGlyphs(foundDelimiter.get.matchedDelimiterValue.get),
-        )
+        if (requiredOnEmptyValue) {

Review Comment:
   This is for the case where the data is `John,(Jacob),Doe` when present and `John,,Doe` when not present having an initiator of `(` and a terminator of `)` for the middle name. When the middle name was not present, Daffodil was raising an error as described in the ticket: `Parse Error: Found out of scope delimiter: ',' ','`. When that block is reached, the existing comment says "It was a remote delimiter but we should have found a local one". I put a breakpoint here and in this case it found a comma while it was looking for the parens. In this case, where the delimiters should be absent when the value is empty, parsing can continue. That seemed to work but I'm not sure if that is 100% correct. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann commented on pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann commented on PR #981:
URL: https://github.com/apache/daffodil/pull/981#issuecomment-1462058276

   Ok, I'm closing this PR for now and updating the ticket with this information. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.
stevedlawrence commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129833778


##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/LocalElementMixin.scala:
##########
@@ -124,4 +129,21 @@ trait LocalElementMixin extends ParticleMixin with LocalElementGrammarMixin {
     res
   }.value
 
+  final def isDelimiterRequiredWhenEmpty(t: DelimiterTextType.Type): Boolean = {

Review Comment:
   Instead of making this a function that only makes sense for initiator/terminator, it might make sense to make new LVs, e.g.
   
   ```scala
   private lazy val isInitiatorRequiredWhenEmpty = LV(`...`) {
     emptyValueDelimiterPolicy match {
       case EmptyValueDelimiterPolicy.Both => true
       case EmptyValueDelimiterPolicy.Initiator => true
       case => false
     }
   }
   
   private lazy val isTerminatorRequiredWhenEmpty = LV(`...`) {
     ...
   }
   ```
   
   Then the Initiator/Terminator classes can pass in `is<Foo>RequiredWhenEmpty` when extending DelimiterText. This avoids the hasInitiator/Terminator calls since these LV's will only be called if there are initiator/terminators, and avoids unreachable code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.
stevedlawrence commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129820680


##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/LocalElementMixin.scala:
##########
@@ -84,6 +85,10 @@ trait LocalElementMixin extends ParticleMixin with LocalElementGrammarMixin {
       else if (
         isDefaultable && !hasTerminator && emptyValueDelimiterPolicy =:= EmptyValueDelimiterPolicy.Terminator
       ) true
+      else if (
+        hasInitiator && hasTerminator && emptyValueDelimiterPolicy =:= EmptyValueDelimiterPolicy.None

Review Comment:
   I'm not sure I understand this? Seems having an initiator/terminator shouldn't matter if `emptyValueDelimiterPolicy="none"`?
   
   I wonder if the `isDefaultable && emptyValueDelimiterPolicy =:= EmptyValueDelimiterPolicy.None` check is incorrect? I'm not sure why isDefaultable matters when determining if something could be missing. Or maybe a check in `isDefaultable` is incorrect?



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/primitives/PrimitivesDelimiters.scala:
##########
@@ -68,15 +69,33 @@ abstract class DelimiterText(
     case _ => false
   }
 
+  // When the value is known to be empty, is this delimiter required to
+  // be present according to the emptyValueDelimiterPolicy
+  private val requiredOnEmptyValue = e match {
+    case elemB: ElementBase => elemB.isDelimiterRequiredWhenEmpty(delimiterType)
+    case _ => true
+  }
+
+  // When the value is not yet known to be empty, can we continue parsing
+  // if we have a missing delimiter
+  private val proceedOnMissingDelimiter = e match {
+    case elemB: ElementBase =>
+      delimiterType == DelimiterTextType.Initiator && elemB.emptyValueDelimiterPolicy == EmptyValueDelimiterPolicy.Terminator

Review Comment:
   What about the `none` policy? Doesn't that mean the initiator is optional and we can proceed if it's missing?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/DelimiterParsers.scala:
##########
@@ -108,12 +109,14 @@ class DelimiterTextParser(
     if (foundDelimiter.isDefined) {
       if (!containsLocalMatch(foundDelimiter.get.matchedDFAs, start)) {
         // It was a remote delimiter but we should have found a local one.
-        PE(
-          start,
-          "Found out of scope delimiter: '%s' '%s'",
-          foundDelimiter.get.matchedDFAs(0).lookingFor,
-          Misc.remapStringToVisibleGlyphs(foundDelimiter.get.matchedDelimiterValue.get),
-        )
+        if (requiredOnEmptyValue) {

Review Comment:
   I'm not sure I understand this. We found a delimiter, but not the one we were looking for. Seems like that should always be a PE?



##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/unparsers/runtime1/DelimiterUnparsers.scala:
##########
@@ -54,6 +55,15 @@ class DelimiterTextUnparser(
       s"Unparsing starting at bit position: ${state.dataOutputStream.maybeAbsBitPos0b}",
     )
 
+    // Do not emit the delimiter if the value is empty and the emptyValueDelimiterPolicy
+    // indicates it should be suppressed
+    if (state.currentInfosetNode.isSimple) {
+      val node = state.currentInfosetNode.asSimple
+      if (node.hasValue && node.isEmpty && !requiredOnEmptyValue) {
+        return

Review Comment:
   We usually try to avoid return. I think the only way around it is to wrap everyhing in an if-block, which could be messy. So maybe it's fine in this case? Thoughts from others?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/DelimiterParsers.scala:
##########
@@ -145,8 +147,8 @@ class DelimiterTextParser(
         localTypedDelims = localTypedDelims + " " + scannedDelims.next().lookingFor
       }
 
-      PE(start, "%s '%s' not found", delimiterType.toString, localTypedDelims)
-      return
+      if (!proceedOnMissingDelimiter)
+        PE(start, "%s '%s' not found", delimiterType.toString, localTypedDelims)

Review Comment:
   Where do we require that the value was empty? It seems like this would allow a delimiter to be missing when there is a non-empty value?



##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/unparsers/runtime1/DelimiterUnparsers.scala:
##########
@@ -54,6 +55,15 @@ class DelimiterTextUnparser(
       s"Unparsing starting at bit position: ${state.dataOutputStream.maybeAbsBitPos0b}",
     )
 
+    // Do not emit the delimiter if the value is empty and the emptyValueDelimiterPolicy
+    // indicates it should be suppressed
+    if (state.currentInfosetNode.isSimple) {

Review Comment:
   Does emptyValueDelmiterPolicy not apply on complex elements? Can't complex elements be empty as well?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann commented on pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann commented on PR #981:
URL: https://github.com/apache/daffodil/pull/981#issuecomment-1460814566

   > My first insinct is that maybe we need a new combinator? So a new parser that accepts three sub parsers (initiator, content, and terminator). It parses each of these allowing some to fail depending on the `emptyValueDelimiterPolicy` and the whether the parsed content was empty.
   
   This makes more sense because it was getting a bit confusing trying to figure out how to make these changes with the current structure. I'll take a look and implementing it this way but I may have a few questions on how to do this properly. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mbeckerle (via GitHub)" <gi...@apache.org>.
mbeckerle commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129899539


##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/unparsers/runtime1/DelimiterUnparsers.scala:
##########
@@ -54,6 +55,15 @@ class DelimiterTextUnparser(
       s"Unparsing starting at bit position: ${state.dataOutputStream.maybeAbsBitPos0b}",
     )
 
+    // Do not emit the delimiter if the value is empty and the emptyValueDelimiterPolicy
+    // indicates it should be suppressed
+    if (state.currentInfosetNode.isSimple) {
+      val node = state.currentInfosetNode.asSimple
+      if (node.hasValue && node.isEmpty && !requiredOnEmptyValue) {
+        return

Review Comment:
   Return is "maintenance unstable" in scala.  You modify the code at some point and an exception gets involved and then the return ends up causing real problems. 
   
   Best to remove them. They don't compose well with exception throw/catch/try or Try objects. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.
stevedlawrence commented on PR #981:
URL: https://github.com/apache/daffodil/pull/981#issuecomment-1460893711

   I wonder if we should just say that `emptyValueDelimiterPolicy` must be `both` (I think our current behavior) and plan to implement it in a future release? Implementing this propery could be considered implementing a new DFDL feature, and the focus for 3.5.0 is bug fixes. We also don't have any formats that needing this property.
   
   We may want to change DAFFODIL-2205 so that it isn't so much about fixing value with "none", but to just implement `emptyValueDelimiterPolicy` values other than "both"?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "tuxji (via GitHub)" <gi...@apache.org>.
tuxji commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129895637


##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/unparsers/runtime1/DelimiterUnparsers.scala:
##########
@@ -54,6 +55,15 @@ class DelimiterTextUnparser(
       s"Unparsing starting at bit position: ${state.dataOutputStream.maybeAbsBitPos0b}",
     )
 
+    // Do not emit the delimiter if the value is empty and the emptyValueDelimiterPolicy
+    // indicates it should be suppressed
+    if (state.currentInfosetNode.isSimple) {
+      val node = state.currentInfosetNode.asSimple
+      if (node.hasValue && node.isEmpty && !requiredOnEmptyValue) {
+        return

Review Comment:
   Good point.  The whole function is about 60 lines long and the return occurs 10 lines into the function's body.  The rest of the function would have to be wrapped in an if-block, or the function refactored to call sub-functions which do the rest of the work.  I'm fine with the return if others agree.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann commented on pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann commented on PR #981:
URL: https://github.com/apache/daffodil/pull/981#issuecomment-1460749045

   > I'm not sure how we currently implement things though, or maybe those other policies are broken too?
   
   I started with the original ticket, placed that test case in (in the form that I thought it wanted), verified that the error was the one described in the ticket, and worked from there trying to get the test to pass. I didn't find an obvious place in the code where these other cases were being handled so I focused on getting the 'none' case to work first. Once that was working, I then suspected that the other cases may not be handled since I didn't run across them. I added those additional cases as tests and some of them didn't work. I then made some additional changes to get those to pass. 
   
   I just pulled the latest from main and copied over the new tests there to see what the results would be. Only the 'both' test passed--the 'initiator', 'terminator', and 'none' tests failed. So I don't think the emptyValueDelimiterPolicy got put in or in a way that was exercised by the tests. 
   
   Also, I don't have a grasp on the details of the parsing at the moment so I'm making most of my changes through trial and error and seeing what that fixes and what that breaks. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann commented on code in PR #981:
URL: https://github.com/apache/daffodil/pull/981#discussion_r1129961689


##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/LocalElementMixin.scala:
##########
@@ -84,6 +85,10 @@ trait LocalElementMixin extends ParticleMixin with LocalElementGrammarMixin {
       else if (
         isDefaultable && !hasTerminator && emptyValueDelimiterPolicy =:= EmptyValueDelimiterPolicy.Terminator
       ) true
+      else if (
+        hasInitiator && hasTerminator && emptyValueDelimiterPolicy =:= EmptyValueDelimiterPolicy.None

Review Comment:
   This was the only place in the code where I found something working with the emptyValueDelimiterPolicy directly. The negated value of this function gets set as "hasKnownRequiredSyntax" so I put this change in to see what would happen. This was to express the case that if there is both a initiator and a terminator but the policy is none, then it does not have a known required syntax. This change ended up having no effect and as I was tracing through the code, it didn't seem to be used in a way related to delimiter handling. I ended up looking elsewhere but forgot to remove this from the code. I just removed it and the tests still pass so it still had no effect. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.
stevedlawrence commented on PR #981:
URL: https://github.com/apache/daffodil/pull/981#issuecomment-1460762916

   Sounds like maybe `emptyValueDelimiterPolicy` is just completely broken/not implemented then? We might want to consider how to best implement each of these instead of doing one at a time, there might be a very clean solution.
   
   My first insinct is that maybe we need a new combinator? So a new parser that accepts three sub parsers (initiator, content, and terminator). It parses each of these allowing some to fail depending on the `emptyValueDelimiterPolicy` and the whether the parsed content was empty.
   
   And the Unparse combinator would look at the value of content, and if empty then it unparses the initiator/terminator based on the policy.
   
   This way delimiters don't need to know if they should be optional or not, and they don't have to worry about the content being empty or not. They just do what they do, and if they fail, the combinator that controls them handles it accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mike-mcgann closed pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.
mike-mcgann closed pull request #981: Update parsing when emptyValueDelimiterPolicy is 'none'
URL: https://github.com/apache/daffodil/pull/981


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org