You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@daffodil.apache.org by "mike-mcgann (via GitHub)" <gi...@apache.org> on 2023/02/13 21:08:35 UTC

[GitHub] [daffodil] mike-mcgann opened a new pull request, #962: Validate occurrences on rendered elements

mike-mcgann opened a new pull request, #962:
URL: https://github.com/apache/daffodil/pull/962

   When processing an array Daffodil tracks the current index and it uses that final value to validate the number of occurrences found. If the emptyElementParsePolicy is set to absent, the index may not reflect the true size of the array. This change adds a separate counter to only increment when parsing results in a normal, non-absent, value.
   
   [DAFFODIL-2364](https://issues.apache.org/jira/browse/DAFFODIL-2364)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] tuxji commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "tuxji (via GitHub)" <gi...@apache.org>.

tuxji commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105061269


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -339,7 +340,8 @@ abstract class RepeatingChildParser(
    * by way of index: e.g., fn:exists( optElement[dfdl:currentIndex()]  )
    */
   def endArray(state: PState): Unit = {
-    val actualOccurs = state.mpstate.arrayIndexStack.pop()
+    state.mpstate.arrayIndexStack.pop()
+    val actualOccurs = state.mpstate.arrayCountStack.pop()
     super.endArray(state, actualOccurs)

Review Comment:
   Let's use the same variable name both here where we call `endArray` and down there where we define `endArray`'s parameter.  I'm fine with keeping `actualOccurs` if you want to use that name consistently in both places, or use the new name consistently in both places as well.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -509,7 +511,7 @@ trait EndArrayChecksMixin {
 
   def erd: ElementRuntimeData
 
-  def endArray(state: ParseOrUnparseState, actualOccurs: Long): Unit = {
+  def endArray(state: ParseOrUnparseState, occurrence: Long): Unit = {

Review Comment:
   I'm fine with keeping or renaming `actualOccurs`, but the new name should be plural as well (`occurrences` or `numOccurrences`, which isn't all that different from `actualOccurs` actually).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107267826


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -519,8 +519,13 @@ trait EndArrayChecksMixin {
         val minO = erd.minOccurs
         val maxO = erd.maxOccurs
         val isUnbounded = maxO == -1
-        val occurrence = actualOccurs - 1
-
+        val occurrence = {
+          val maybeLastChild = state.infoset.maybeLastChild

Review Comment:
   Added



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mbeckerle commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mbeckerle (via GitHub)" <gi...@apache.org>.

mbeckerle commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106150748


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   Since DIArray already has a member that contains the count, yes I would think we should just use that. 
   
   Changes to the array will already get backtracked properly. 
   
   If dfdl:occursIndex() is returning the iteration number, then that's going to be a bug for sure in these sparse array cases where attempts to parse an element succeed but create an empty value that causes the element to not be attached to the infoset.  Separate bug though. We need a test to illustrate it. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107220096


##########
daffodil-test/src/test/scala/org/apache/daffodil/section05/facets/TestNulChars.scala:
##########
@@ -35,6 +35,5 @@ class TestNulChars {
   // DAFFODIL-2363 &#xE000; (NUL replacement into XML) can't be used in pattern facet. With full validation.
   @Test def test_nulPattern1() = { runner.runOneTest("nulPattern1") }
 
-  // DAFFODIL-2364 - infinite loop
-  // @Test def test_nulPad1() = { runner.runOneTest("nulPad1") }
+  @Test def test_nulPad1() = { runner.runOneTest("nulPad1") }

Review Comment:
   Added test for this input



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -509,7 +511,7 @@ trait EndArrayChecksMixin {
 
   def erd: ElementRuntimeData
 
-  def endArray(state: ParseOrUnparseState, actualOccurs: Long): Unit = {
+  def endArray(state: ParseOrUnparseState, occurrence: Long): Unit = {

Review Comment:
   These changes have been replaced with counting the children in the array node. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on PR #962:
URL: https://github.com/apache/daffodil/pull/962#issuecomment-1431776051

   > I see that the new comment failed the CI's scalafmt check. Why did the check fail; did scalafmt want to rewrap the comment or what?
   
   It was a blank line that it wanted to remove. So I just found out that with IntelliJ, if it automatically saves the file for you, it won't do the formatting. It only applies it if you manually save the file with Control-S. I'm going to see if I can disable the auto-saving as that is confusing me as I switch between IntelliJ and VS Code. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106269118


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   I think this code does something similar to what you need:
   
   https://github.com/apache/daffodil/blob/main/daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceParserBases.scala#L216-L238
   
   I think `pstate.infoset.maybeLastChild` should return the last `DINode` child of padRoot (i.e. the nz array). And since the `DINode` interface has a `numChildren` method, you might be able to just use that and not even need to check if it's an array. So if the child isn't even an array validation will still work. I think...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] tuxji commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "tuxji (via GitHub)" <gi...@apache.org>.

tuxji commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107459547


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -519,8 +519,13 @@ trait EndArrayChecksMixin {
         val minO = erd.minOccurs
         val maxO = erd.maxOccurs
         val isUnbounded = maxO == -1
-        val occurrence = actualOccurs - 1
-
+        val occurrence = {
+          val maybeLastChild = state.infoset.maybeLastChild

Review Comment:
   I see that the new comment failed the CI's scalafmt check.  Why did the check fail; did scalafmt want to rewrap the comment or what?  I believe there's a way to configure Intellij IDEA to call scalafmt upon save so you can avoid failing the scalafmt check again in the future.  Please bring the [Intellij IDEA Setup](https://cwiki.apache.org/confluence/display/DAFFODIL/IntelliJ+IDEA+Setup) page up to date (what it says about the CLI tests is also out of date) so that developers will know how to avoid failing the check multiple times.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107248312


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceChildBases.scala:
##########
@@ -519,8 +519,13 @@ trait EndArrayChecksMixin {
         val minO = erd.minOccurs
         val maxO = erd.maxOccurs
         val isUnbounded = maxO == -1
-        val occurrence = actualOccurs - 1
-
+        val occurrence = {
+          val maybeLastChild = state.infoset.maybeLastChild

Review Comment:
   I think it's worth adding a comment explaining why we have this logic for figuring out the number of occurrences of this array, since the obvious logic of something like `pstate.infoset.numChildren` is actually wrong. Maybe something like this?
   
   > at this point, state.infoset is the parent node of the DIArray that we are trying to validate. If any elements were added for this array, then the DIArray will be the last child of that parent node AND it will have the same erd as this parser. If that's not the case, and the parent has no children or it does but the last child has a different erd than what we expect, then it must mean nothing was added for this array and the occurrences are zero.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105024811


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceParserBases.scala:
##########
@@ -186,6 +186,9 @@ abstract class SequenceParserBase(
               if (ais ne Done) {
                 pstate.mpstate.moveOverOneArrayIndexOnly()
               }
+              if (resultOfTry == ParseAttemptStatus.NormalRep) {
+                pstate.mpstate.incrementArrayCount()
+              }

Review Comment:
   It's not clear to me why we need this separate array counter. Does something rely on the arrayIndex to be incremented even when resultOfTry is not NormalRep?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   I'm hoping we don't need this arrayCountStack (see other comment), but if we do, then I think we need to modify the `Mark` class in the `MPState` object at the top of this file to be able to capture and reset this state in case backtracking occurs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105785855


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   Ok, just to be to clear, we could have something like `dfdl:occursCountKind="implicit"`, `minOccurs="5"`, `maxOccurs="5"`, and with the right combination of other settings, we know daffodil will parse 5 things, but some of those could be "empty", and so we could end up with less than 5 things in the infoset? Very confusing, but good to know.
   
   Followup question, do we even need a second counter? In the cases where we need to know the actual number of occurrences in an infoset, can we just count the number of occurrences in the infoset? The `DIArray` class has a `length` member with this information. Can we just use that?
   
   We also probably need to take a good look at `arrayIndexStack` and when it's used. I'm concerned that we make assumptions in a number of places that this is the number of array elements, and that's not always the case. This is for sure the case with `dfdl:occursIndex()` function, which uses arrayPos which I think is just the top of the stack. Maybe `arrayIndexStack` really wants to be renamed to something like `arrayIterationStack` or something, and have some thorough comments about what it is and why it's not necessarily the same as the number of array elements in the infoset.  Fine with me if this is a new issue/PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106237438


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   > In the cases where we need to know the actual number of occurrences in an infoset, can we just count the number of occurrences in the infoset? The `DIArray` class has a `length` member with this information. Can we just use that?
   
   I tried to find an easy way from within the `endArray` function to get a hold of that but was unable to find one. The current node at this point is the containing sequence itself (padRoot, not nz). What is the best way to get a reference to the `DIArray` here? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106318024


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   > I think pstate.infoset.maybeLastChild should return the last DINode child of padRoot (i.e. the nz array). And since the DINode interface has a numChildren method, you might be able to just use that and not even need to check if it's an array. So if the child isn't even an array validation will still work. I think...
   
   Right above the `endArray` call in the above snippet, `maybeLastChild` at that point is `Nope`, and the contents of `pstate.infoset` contains padRoot with no children. I was trying to obtain this information from the `ElementRuntimeData` but that could contain anything inside the sequence--multiple arrays, simple types, null values, etc.  
   
   And is it me, or does the debugger have issues finding variables or are they getting optimized out? At times I have to do some aliasing to get things to appear, such as `val foo = this.foo`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107219420


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceParserBases.scala:
##########
@@ -186,6 +186,9 @@ abstract class SequenceParserBase(
               if (ais ne Done) {
                 pstate.mpstate.moveOverOneArrayIndexOnly()
               }
+              if (resultOfTry == ParseAttemptStatus.NormalRep) {

Review Comment:
   These changes have been replaced with counting the children in the array node. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106210081


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   I wrote up a quick test and can confirm that `dfdlx:occursIndex()` returns the iteration number and not the number of elements in the infoset. I'll open a ticket and submit a PR to add the test shortly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105060963


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   >  maxOccurs 
   
   Isn't maxOccurs just about what appears in the infoset?
   
   > defaulting
   
   If something is defaulted, doesn't that mean it will have an instance in the infoset and so would be counted in arrayIndex?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mbeckerle commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mbeckerle (via GitHub)" <gi...@apache.org>.

mbeckerle commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105051465


##########
daffodil-test/src/test/scala/org/apache/daffodil/section05/facets/TestNulChars.scala:
##########
@@ -35,6 +35,5 @@ class TestNulChars {
   // DAFFODIL-2363 &#xE000; (NUL replacement into XML) can't be used in pattern facet. With full validation.
   @Test def test_nulPattern1() = { runner.runOneTest("nulPattern1") }
 
-  // DAFFODIL-2364 - infinite loop
-  // @Test def test_nulPad1() = { runner.runOneTest("nulPad1") }
+  @Test def test_nulPad1() = { runner.runOneTest("nulPad1") }

Review Comment:
   Unless there already is one, I suggest we need another test which is the real "sparse-array" situation:
   I.e., the input data would be ```00 00 31 00 00 32 00 00 33 00 00``` and the array will have 3 children, the absent ones because of the nul chars would be Absent and so not added to the infoset. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/SequenceParserBases.scala:
##########
@@ -186,6 +186,9 @@ abstract class SequenceParserBase(
               if (ais ne Done) {
                 pstate.mpstate.moveOverOneArrayIndexOnly()
               }
+              if (resultOfTry == ParseAttemptStatus.NormalRep) {

Review Comment:
   This is one of very few cases when I'd actually prefer the code to use the negative and say:
   ```
   resultOfTry != ParseAttemptStatus.AbsentRep
   ```
   Then you can add a comment explaining that if the rep is AbsentRep, then no element is added to the array, but in every other case we do add an element to the array, and so we count it. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   Scaladoc this element to explain exactly why it is needed, what it is counting, etc. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   I do see a need for this counter, as the array index is needed to keep track of where in the iteration space we are, for maxOccurs handling, defaulting, etc. 
   The count handles the fact that despite all that iteration, one might not actually be attaching anything to the infoset.
   
   So I do think you will need to add the arrayCountStack to the save/restore state stuff for backtracking. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mike-mcgann commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mike-mcgann (via GitHub)" <gi...@apache.org>.

mike-mcgann commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1107218116


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   Updated to count the children of maybeLastChild



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] tuxji merged pull request #962: Validate occurrences on rendered elements

Posted by "tuxji (via GitHub)" <gi...@apache.org>.

tuxji merged PR #962:
URL: https://github.com/apache/daffodil/pull/962


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106270842


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   Note: DAFFODIL-2791 and PR #964 added to show that `dfdl:ocurrsIndex` (and likely anything else using arrayIndex as a count of infoset items) is probably broken if when `emptyElemenParsePolicy="treatAsMissing"`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] mbeckerle commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "mbeckerle (via GitHub)" <gi...@apache.org>.

mbeckerle commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1105110359


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   maxOccurs bounds what appears in the infoset. 
   
   We need to count the number of iterations of the parsing. Why? because of maxOccurs and the forward progress requirement. If maxOccurs is bounded, then even if we aren't adding anything to the infoset we still iterate maxOccurs times. So we need the counter that has nothing to do with how many infoset items have actually been retained and attached to the infoset. Just so we know when to stop iterating. 
   
   It's only when maxOccurs is unbounded that we insist on forward progress for each parse.  That forward progress can; however, produce the "empty" representation (e.g., for an empty string element that might be something that requires quotation marks. That would still result in nothing in the infoset if treatAsAbsent is the emptyElementParsePolicy setting. So we can still be consuming data (quotation marks) and yet not adding anything to the infoset. So we can make forward progress in the data without creating any infoset elements. 
   
   So it's pretty clear there are two pieces of state here. One is the index/count of how many infoset elements there are. The other is the iteration count going up to maxOccurs only in the case where that bound is being used. 
   
   (btw: this whole thing probably not a good idea in the DFDL spec, and we probably need to change this despite the DFDL specification. Tools like XSAT insist on large values for maxOccurs rather than unbounded, but that changes the behavior of the schema from one that insists on forward progress to one that does not. I think this was not understood when DFDL was specified, that changing from unbounded to a bound would actually change parse behavior on the very first parse of an array element. So maxOccurs=4000000 means even if the parse is stuck in a rut we'll iterate 4000000 times, where if maxOccurs was unbounded we'd stop immediately. )
   
   So I think adding a second counter, which answers the question how-many infoset elements, distinct from the counter that is providing a count up to maxOccurs to stop the iteration, makes sense to me. 
   
   Interestingly, dfdl:occursIndex function, if called, needs to give back, not the iteration counter, but the index of the current "real" infoset element being attached. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #962: Validate occurrences on rendered elements

Posted by "stevedlawrence (via GitHub)" <gi...@apache.org>.

stevedlawrence commented on code in PR #962:
URL: https://github.com/apache/daffodil/pull/962#discussion_r1106340070


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -107,8 +107,11 @@ object MPState {
 class MPState private () {
 
   val arrayIndexStack = MStackOfLong()
+  val arrayCountStack = MStackOfLong()

Review Comment:
   I think that might be correct? All the children of padRoot have been removed in this particular test because they were all empty.  So if `infoset.maybeLastChild` is `Nope`, then the number of occurrences of the array we were parsing is zero. If `infoset.maybeLastChild` is defined, then it should be a `DIArray` instance and numChildren should get the length of that array.
   
   But I think maybe that raises a potential edge case--if there were previous siblings to the array, but the array itself had no occurrences, then `maybeLastChild` would return some DINode other than the array we are trying to validate. So if `maybeLastChild` is defined, then we need to make sure it's the thing we actually care about. Maybe the logic looks something like this:
   
   ```scala
   val maybeLasChild = infoset.maybeLastChild
   val arrayOccurrences =
     if (maybeLastChild.isEmpty || maybeLastChild.get.erd != erd)
       // no occurrences for the array this parser cares about were added
       0
     else 
       maybeLastChild.get.numChildren
   ```
   
   I don't use the Scala debugger, so unfortunately I can' help with that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org