You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2022/12/23 02:24:57 UTC

[GitHub] [daffodil] mbeckerle opened a new pull request, #900: Draft: Work in progress - Implement textNumberPattern 'V' virtual decimal point.

mbeckerle opened a new pull request, #900:
URL: https://github.com/apache/daffodil/pull/900

   Compiles. Tests run assuring nothing is broken.
   
   No test uses the 'V' pattern symbol as yet.
   
   That's the next step.
   
   DAFFODIL-853


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059555348


##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/processors/unparsers/ConvertTextNumberUnparser.scala:
##########
@@ -43,16 +47,18 @@ case class ConvertTextCombinatorUnparser(
 case class ConvertTextNumberUnparser(

Review Comment:
   rename to ConvertTextStandardNumberUnparser. Rename file to correspond. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -54,11 +56,63 @@ case class ConvertTextCombinatorParser(
   }
 }
 
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)
+
+  final protected def applyTextDecimalVirtualPointForParse(num1: JNumber): JNumber = {
+    if (textDecimalVirtualPoint == 0) num1
+    else {
+      // scale for virtual decimal point
+      val scaledNum: JNumber = num1 match {
+        // Empirically, in our test suite, we do get Long back here, so the runtime sometimes represents small integer
+        // (or possibly even smaller decimal numbers with no fraction part) as Long.
+        case l: JLong => JBigDecimal.valueOf(l).scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case bd: JBigDecimal => bd.scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case f: JFloat => (f / virtualPointScaleFactor).floatValue()
+        case d: JDouble => d / virtualPointScaleFactor
+        // $COVERAGE-OFF$
+        case _ => badType(num1)
+        // $COVERAGE-ON$
+      }
+      scaledNum
+    }
+  }
+
+  // $COVERAGE-OFF$
+  private def badType(num1: AnyRef) = {
+    Assert.invariantFailed(
+      s"""Number cannot be scaled for virtual decimal point,
+         |because it is not a decimal, float, or double.
+         |The type is ${num1.getClass.getSimpleName}.""".stripMargin)
+  }
+  // $COVERAGE-ON$
+
+  final protected def applyTextDecimalVirtualPointForUnparse(valueAsAnyRef: AnyRef) : JNumber = {
+    valueAsAnyRef match {
+      // This is not perfectly symmetrical with the parse side equivalent.
+      // Empirically in our test suite, we do not see JLong here.
+      case f: JFloat => f * virtualPointScaleFactor
+      case d: JDouble => d * virtualPointScaleFactor
+      case bd: JBigDecimal => bd.scaleByPowerOfTen(textDecimalVirtualPoint)
+      case n: JNumber =>
+        if (textDecimalVirtualPoint == 0) n
+        else badType(n)
+      // $COVERAGE-OFF$
+      case _ => Assert.invariantFailed("Not a JNumber")
+      // $COVERAGE-ON$
+    }
+  }
+}
+
 case class ConvertTextNumberParser(

Review Comment:
   Rename to ConvertTextStandardNumberParser for consistency. The file should also be renamed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1060614906


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.

Review Comment:
   If this is for tests only, should it be in `src/test` somewhere?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]

Review Comment:
   We don't really use package specific private very often. Is there a reason it's used in this case, and should we be using it more often in other cases?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"

Review Comment:
   Can you add a comment about this string, or change it to something more obvious about its purpose? At first I though it was some listing of characters that have special meaning in an ICU number pattern, but finally realized it's just random characters to avoid matching parts of an actual ICU pattern.
   
   Might be another way to do this logic?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.

Review Comment:
   I don't think I understand this comment?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").

Review Comment:
   What if we have a textNumberPattern like `000'xxP'` (e.g. the `P` is inside quotes, but not immediately after an opening quote)? Will this incorrectly remove that P?
   
   I'm wondering if an alternative algorithm is to split the string into a sequence of either quoted strings or single special ICU chars, filter out anything that is a V or P, and then combine it all back together? It avoids the QQ thing and maybe makes it more clear whats going on?
   
   I did minimal testing, but maybe something like this:
   ```scala
   val regex = "'.*?'|.".r
   regex.findAllIn(pattern)
     .filterNot(_ == "P")
     .filterNot(_ == "V")
     .mkString("")
   ```



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {

Review Comment:
   In reviewing this, I've forgotten what patternStripped removes. I believe it's just without quotes--is it worth renaming to `patternNoQuotes` so it's clear what's been removed? 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").
+      replaceAll(uniqueQQString, "''")
+  }
+}
+
+trait ConvertTextNumberMixin {
+
+  def e: ElementBase
+
+  /**
+   * Convenience. The original value of textNumberPattern
+   */
+  @inline
+  final protected def pattern = e.textNumberPattern
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of the corresponding textNumberRep. Computes number of digits to right of the V.
+   *
+   * SDE if pattern has a syntax error.
+   *
+   * @return the number of digits to the right of the V character. Must be 1 or greater.
+   *
+   */
+  protected def textDecimalVirtualPointFromPattern: Int
+
+  /**
+   * The pattern with escaped characters removed.
+   * This can be reliably searched for pattern characters,
+   * but cannot be used as an operational pattern given that
+   * potentially lots of things have been removed from it.
+   */
+  final protected lazy val patternStripped = {
+    // note: tick == apos == ' == single quote.
+    // First remove entirely all escaped ticks ie., ''
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    // Next remove all tick-escaped characters entirely
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    // the remaining string contains only pattern special characters
+    // and regular non-pattern characters
+    patternNoQuoted
+  }
+
+  protected final lazy val hasV = patternStripped.contains("V")
+  protected final lazy val hasP = patternStripped.contains("P")
+
+  /**
+   * analogous to the property dfdl:binaryDecimalVirtualPoint
+   *
+   * Value is 0 if there is no virtual decimal point.
+   * Value is the number of digits to the right of the 'V'
+   */
+  final lazy val textDecimalVirtualPoint: Int = {
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
+        e.notYetImplemented("textNumberPattern with P symbol")
+      }
+      Assert.invariant(hasV)
+      //
+      // check for things incompatible with "V"
+      //
+      val virtualPoint = textDecimalVirtualPointFromPattern
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  //
+  // We need 2 patterns
+  // 1. The original textNumberPattern string from the schema
+  // 2. Same but with the P or V removed.
+  //
+  // We need to use the original textNumberPattern in diagnostic messages
+  // since that's what the schema author wrote and expects to see.
+  // But if the pattern has P or V, then we need ICU at runtime to use the original
+  // but with the P or V removed since the P and V don't actually represent anything in the data.
+  final protected lazy val runtimePattern = {
+    if (hasV || hasP) {
+      val patternWithoutPV = TextNumberPatternUtils.removeUnquotedPV(pattern)
+      patternWithoutPV
+    }
+    else pattern
+  }
+
+  final protected def checkPatternWithICU(e: ElementBase) = {
+    // Load the pattern to make sure it is valid
+    try {
+      if (hasV || hasP) {
+        new DecimalFormat(runtimePattern)
+      } else {
+        new DecimalFormat(pattern)
+      }
+    } catch {
+      case ex: IllegalArgumentException =>
+        if (hasV || hasP) {
+          // we don't know what the diagnostic message will say here
+          // since it is from the ICU library.
+          // That library has only seen the pattern with the P and V
+          // removed. The messages might show that pattern which would
+          // confuse the schema author since that's not the pattern
+          // they wrote.
+          //
+          // So we try to explain....
+          e.SDE(
+            """Invalid textNumberPattern.
+              | The errors are about the pattern with the P (decimal scaling position)
+              | and V (virtual decimal point) characters removed: %s""".stripMargin, ex)
+        } else {
+          e.SDE("Invalid textNumberPattern: %s", ex)
+        }
+    }
+  }
+}
+
+case class ConvertTextStandardNumberPrim(e: ElementBase)
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   Why does `None` imply there is a `V` in the pattern? Doesn't that mean the the opposite and that there wasn't a V because the vregex pattern didn't match?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)

Review Comment:
   Maybe make this lazy? No need to calculate for he JLong and JBigDecimal cases.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)
+
+  final protected def applyTextDecimalVirtualPointForParse(num1: JNumber): JNumber = {
+    if (textDecimalVirtualPoint == 0) num1
+    else {
+      // scale for virtual decimal point
+      val scaledNum: JNumber = num1 match {
+        // Empirically, in our test suite, we do get Long back here, so the runtime sometimes represents small integer
+        // (or possibly even smaller decimal numbers with no fraction part) as Long.
+        case l: JLong => JBigDecimal.valueOf(l).scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case bd: JBigDecimal => bd.scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case f: JFloat => (f / virtualPointScaleFactor).toFloat
+        case d: JDouble => (d / virtualPointScaleFactor).toDouble
+        // $COVERAGE-OFF$
+        case _ => badType(num1)
+        // $COVERAGE-ON$
+      }
+      scaledNum
+    }
+  }
+
+  // $COVERAGE-OFF$
+  private def badType(num1: AnyRef) = {
+    Assert.invariantFailed(
+      s"""Number cannot be scaled for virtual decimal point,
+         |because it is not a decimal, float, or double.
+         |The type is ${num1.getClass.getSimpleName}.""".stripMargin)
+  }
+  // $COVERAGE-ON$
+
+  final protected def applyTextDecimalVirtualPointForUnparse(valueAsAnyRef: AnyRef) : JNumber = {
+    valueAsAnyRef match {
+      // This is not perfectly symmetrical with the parse side equivalent.
+      // Empirically in our test suite, we do not see JLong here.
+      case f: JFloat => (f * virtualPointScaleFactor).toFloat
+      case d: JDouble => (d * virtualPointScaleFactor).toDouble
+      case bd: JBigDecimal => bd.scaleByPowerOfTen(textDecimalVirtualPoint)
+
+      case n: JNumber =>
+        if (textDecimalVirtualPoint == 0) n
+        // $COVERAGE-OFF$ // both badType and the next case are coverage-off
+        else badType(n)
+      case _ => Assert.invariantFailed("Not a JNumber")
+      // $COVERAGE-ON$
+    }
+  }

Review Comment:
   Does this function need to clamp the result to an integer? Or does ICU handle that when we pass the result of this function to `df.format(...)`?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {
+      e.SDE(
+        s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   Similar question as above. How do we know there is a V in the pattern?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059411322


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed, only + for prefix/suffix
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param pattern the dfdl:textNumberPattern pattern
+   * @return -1 if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(pattern: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(pattern) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  def textDecimalVirtualPointForZoned(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    textDecimalVirtualPointForZoned(patternStripped).getOrElse {
+      e.SDE(
+        s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+           |Other than the leading or trailing '+' sign indicator,
+           |it can contain only digits 0-9.""".stripMargin('|'),
+        originalPattern)
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").
+      replaceAll(uniqueQQString, "''")
+  }
+}
+
+trait ConvertTextNumberMixin {
+
+  protected def pattern: String
+
+
+  // The pattern with escaped characters removed.
+  // This can be reliably searched for pattern characters,
+  // but cannot be used as an operational pattern given that
+  // potentially lots of things have been removed from it.
+  final protected lazy val patternStripped = {
+    // note: tick == apos == ' == single quote.
+    // First remove entirely all escaped ticks ie., ''
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    // Next remove all tick-escaped characters entirely
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    // the remaining string contains only pattern special characters
+    // and regular non-pattern characters
+    patternNoQuoted
+  }
+
+  protected final lazy val hasV = patternStripped.contains("V")
+  protected final lazy val hasP = patternStripped.contains("P")
+
+  // analogous to the property dfdl:binaryDecimalVirtualPoint
+  protected final def computeTextDecimalVirtualPoint(e: ElementBase) = {
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
+        e.notYetImplemented("textNumberPattern with P symbol")
+      }
+      Assert.invariant(hasV)
+      //
+      // check for things incompatible with "V"
+      //
+      val virtualPoint = e.textNumberRep match {
+        case TextNumberRep.Standard =>
+          TextNumberPatternUtils.textDecimalVirtualPointForStandard(e, pattern, patternStripped)
+        case TextNumberRep.Zoned =>
+          TextNumberPatternUtils.textDecimalVirtualPointForZoned(e, pattern, patternStripped)
+      }
+
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  //
+  // We need 2 patterns
+  // 1. The original textNumberPattern string from the schema
+  // 2. Same but with the P or V removed.
+  //
+  // We need to use the original textNumberPattern in diagnostic messages
+  // since that's what the schema author wrote and expects to see.
+  // But if the pattern has P or V, then we need ICU at runtime to use the original
+  // but with the P or V removed since the P and V don't actually represent anything in the data.
+  final protected lazy val runtimePattern = {
+    if (hasV || hasP) {
+      val patternWithoutPV = TextNumberPatternUtils.removeUnquotedPV(pattern)
+      patternWithoutPV
+    }
+    else pattern
+  }
+
+  final protected def checkPatternWithICU(e: ElementBase) = {
+    // Load the pattern to make sure it is valid
+    try {
+      if (hasV || hasP) {
+        new DecimalFormat(runtimePattern)
+      } else {
+        new DecimalFormat(pattern)
+      }
+    } catch {
+      case ex: IllegalArgumentException =>
+        if (hasV || hasP) {
+          // we don't know what the diagnostic message will say here
+          // since it is from the ICU library.
+          // That library has only seen the pattern with the P and V
+          // removed. The messages might show that pattern which would
+          // confuse the schema author since that's not the pattern
+          // they wrote.
+          //
+          // So we try to explain....
+          e.SDE(
+            """Invalid textNumberPattern.
+              | The errors are about the pattern with the P (decimal scaling position)
+              | and V (virtual decimal point) characters removed: %s""".stripMargin, ex)

Review Comment:
   I am actually not sure it's possible to exercise this path. The regexs for the V pattern (and P eventually) constrain things pretty strongly in terms of the pattern that's left when it is handed to ICU. 
   
   So this is more of a "just in case" there is something ICU rejects even after we've pre-conditioned it by removing the P and V. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059405800


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -54,11 +56,58 @@ case class ConvertTextCombinatorParser(
   }
 }
 
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val divisor = scala.math.pow(10.0, textDecimalVirtualPoint)
+  final protected val multiplier = scala.math.pow(10.0, textDecimalVirtualPoint)

Review Comment:
   Ha! it's really the "magic number" and one code path multiplies by it. the other divides by it. I'll explain. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Draft: Work in progress - Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1055999972


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,6 +46,85 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
+/**
+ * A regex which matches textNumberPatterns that legally use the V
+ * (implied decimal point) character.
+ */
+object TextNumberPatternUtils {
+
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V
+    //
+    val prefix = """([^0-9#PV]?)"""

Review Comment:
   Can be stronger. prefix and suffix can only contain anything other than the special characters of the pattern language. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on PR #900:
URL: https://github.com/apache/daffodil/pull/900#issuecomment-1364344043

   Ready for review now. 
   Works for textNumberRep standard and zoned. Has tests for both. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle merged pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle merged PR #900:
URL: https://github.com/apache/daffodil/pull/900


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1067579484


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.

Review Comment:
   Addressed in https://github.com/apache/daffodil/pull/910



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").
+      replaceAll(uniqueQQString, "''")
+  }
+}
+
+trait ConvertTextNumberMixin {
+
+  def e: ElementBase
+
+  /**
+   * Convenience. The original value of textNumberPattern
+   */
+  @inline
+  final protected def pattern = e.textNumberPattern
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of the corresponding textNumberRep. Computes number of digits to right of the V.
+   *
+   * SDE if pattern has a syntax error.
+   *
+   * @return the number of digits to the right of the V character. Must be 1 or greater.
+   *
+   */
+  protected def textDecimalVirtualPointFromPattern: Int
+
+  /**
+   * The pattern with escaped characters removed.
+   * This can be reliably searched for pattern characters,
+   * but cannot be used as an operational pattern given that
+   * potentially lots of things have been removed from it.
+   */
+  final protected lazy val patternStripped = {
+    // note: tick == apos == ' == single quote.
+    // First remove entirely all escaped ticks ie., ''
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    // Next remove all tick-escaped characters entirely
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    // the remaining string contains only pattern special characters
+    // and regular non-pattern characters
+    patternNoQuoted
+  }
+
+  protected final lazy val hasV = patternStripped.contains("V")
+  protected final lazy val hasP = patternStripped.contains("P")
+
+  /**
+   * analogous to the property dfdl:binaryDecimalVirtualPoint
+   *
+   * Value is 0 if there is no virtual decimal point.
+   * Value is the number of digits to the right of the 'V'
+   */
+  final lazy val textDecimalVirtualPoint: Int = {
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
+        e.notYetImplemented("textNumberPattern with P symbol")
+      }
+      Assert.invariant(hasV)
+      //
+      // check for things incompatible with "V"
+      //
+      val virtualPoint = textDecimalVirtualPointFromPattern
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  //
+  // We need 2 patterns
+  // 1. The original textNumberPattern string from the schema
+  // 2. Same but with the P or V removed.
+  //
+  // We need to use the original textNumberPattern in diagnostic messages
+  // since that's what the schema author wrote and expects to see.
+  // But if the pattern has P or V, then we need ICU at runtime to use the original
+  // but with the P or V removed since the P and V don't actually represent anything in the data.
+  final protected lazy val runtimePattern = {
+    if (hasV || hasP) {
+      val patternWithoutPV = TextNumberPatternUtils.removeUnquotedPV(pattern)
+      patternWithoutPV
+    }
+    else pattern
+  }
+
+  final protected def checkPatternWithICU(e: ElementBase) = {
+    // Load the pattern to make sure it is valid
+    try {
+      if (hasV || hasP) {
+        new DecimalFormat(runtimePattern)
+      } else {
+        new DecimalFormat(pattern)
+      }
+    } catch {
+      case ex: IllegalArgumentException =>
+        if (hasV || hasP) {
+          // we don't know what the diagnostic message will say here
+          // since it is from the ICU library.
+          // That library has only seen the pattern with the P and V
+          // removed. The messages might show that pattern which would
+          // confuse the schema author since that's not the pattern
+          // they wrote.
+          //
+          // So we try to explain....
+          e.SDE(
+            """Invalid textNumberPattern.
+              | The errors are about the pattern with the P (decimal scaling position)
+              | and V (virtual decimal point) characters removed: %s""".stripMargin, ex)
+        } else {
+          e.SDE("Invalid textNumberPattern: %s", ex)
+        }
+    }
+  }
+}
+
+case class ConvertTextStandardNumberPrim(e: ElementBase)
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   This is only called in a context where hasV is true. 
   I will add an Assert.invariant(hasV) to this to make that clear. 
   Failure to match this regex at all means that the pattern string DOES have a V, but doesn't otherwise obey the rules for patterns with V. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").

Review Comment:
   I will do something like this. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.

Review Comment:
   Addressed in https://github.com/apache/daffodil/pull/910



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {

Review Comment:
   Yes. Will rename. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").

Review Comment:
   This is addressed in https://github.com/apache/daffodil/pull/910 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {
+      e.SDE(
+        s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   Only called when hasV is true. Will add assertion to make that clear. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)
+
+  final protected def applyTextDecimalVirtualPointForParse(num1: JNumber): JNumber = {
+    if (textDecimalVirtualPoint == 0) num1
+    else {
+      // scale for virtual decimal point
+      val scaledNum: JNumber = num1 match {
+        // Empirically, in our test suite, we do get Long back here, so the runtime sometimes represents small integer
+        // (or possibly even smaller decimal numbers with no fraction part) as Long.
+        case l: JLong => JBigDecimal.valueOf(l).scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case bd: JBigDecimal => bd.scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case f: JFloat => (f / virtualPointScaleFactor).toFloat
+        case d: JDouble => (d / virtualPointScaleFactor).toDouble
+        // $COVERAGE-OFF$
+        case _ => badType(num1)
+        // $COVERAGE-ON$
+      }
+      scaledNum
+    }
+  }
+
+  // $COVERAGE-OFF$
+  private def badType(num1: AnyRef) = {
+    Assert.invariantFailed(
+      s"""Number cannot be scaled for virtual decimal point,
+         |because it is not a decimal, float, or double.
+         |The type is ${num1.getClass.getSimpleName}.""".stripMargin)
+  }
+  // $COVERAGE-ON$
+
+  final protected def applyTextDecimalVirtualPointForUnparse(valueAsAnyRef: AnyRef) : JNumber = {
+    valueAsAnyRef match {
+      // This is not perfectly symmetrical with the parse side equivalent.
+      // Empirically in our test suite, we do not see JLong here.
+      case f: JFloat => (f * virtualPointScaleFactor).toFloat
+      case d: JDouble => (d * virtualPointScaleFactor).toDouble
+      case bd: JBigDecimal => bd.scaleByPowerOfTen(textDecimalVirtualPoint)
+
+      case n: JNumber =>
+        if (textDecimalVirtualPoint == 0) n
+        // $COVERAGE-OFF$ // both badType and the next case are coverage-off
+        else badType(n)
+      case _ => Assert.invariantFailed("Not a JNumber")
+      // $COVERAGE-ON$
+    }
+  }

Review Comment:
   Returns the type of the argument. 
   NaN and INF floating point values mean this can't just clamp to an Integer. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {
+      e.SDE(
+        s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   Resolved in https://github.com/apache/daffodil/pull/910



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)

Review Comment:
   lazy it is. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)
+
+  final protected def applyTextDecimalVirtualPointForParse(num1: JNumber): JNumber = {
+    if (textDecimalVirtualPoint == 0) num1
+    else {
+      // scale for virtual decimal point
+      val scaledNum: JNumber = num1 match {
+        // Empirically, in our test suite, we do get Long back here, so the runtime sometimes represents small integer
+        // (or possibly even smaller decimal numbers with no fraction part) as Long.
+        case l: JLong => JBigDecimal.valueOf(l).scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case bd: JBigDecimal => bd.scaleByPowerOfTen(-textDecimalVirtualPoint)
+        case f: JFloat => (f / virtualPointScaleFactor).toFloat
+        case d: JDouble => (d / virtualPointScaleFactor).toDouble
+        // $COVERAGE-OFF$
+        case _ => badType(num1)
+        // $COVERAGE-ON$
+      }
+      scaledNum
+    }
+  }
+
+  // $COVERAGE-OFF$
+  private def badType(num1: AnyRef) = {
+    Assert.invariantFailed(
+      s"""Number cannot be scaled for virtual decimal point,
+         |because it is not a decimal, float, or double.
+         |The type is ${num1.getClass.getSimpleName}.""".stripMargin)
+  }
+  // $COVERAGE-ON$
+
+  final protected def applyTextDecimalVirtualPointForUnparse(valueAsAnyRef: AnyRef) : JNumber = {
+    valueAsAnyRef match {
+      // This is not perfectly symmetrical with the parse side equivalent.
+      // Empirically in our test suite, we do not see JLong here.
+      case f: JFloat => (f * virtualPointScaleFactor).toFloat
+      case d: JDouble => (d * virtualPointScaleFactor).toDouble
+      case bd: JBigDecimal => bd.scaleByPowerOfTen(textDecimalVirtualPoint)
+
+      case n: JNumber =>
+        if (textDecimalVirtualPoint == 0) n
+        // $COVERAGE-OFF$ // both badType and the next case are coverage-off
+        else badType(n)
+      case _ => Assert.invariantFailed("Not a JNumber")
+      // $COVERAGE-ON$
+    }
+  }

Review Comment:
   Changed and comments added about this in https://github.com/apache/daffodil/pull/910.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"

Review Comment:
   Addressed in https://github.com/apache/daffodil/pull/910



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,67 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
-
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    TextNumberPatternUtils.textDecimalVirtualPointForZoned(patternStripped).getOrElse {

Review Comment:
   Renamed in https://github.com/apache/daffodil/pull/910



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").
+      replaceAll(uniqueQQString, "''")
+  }
+}
+
+trait ConvertTextNumberMixin {
+
+  def e: ElementBase
+
+  /**
+   * Convenience. The original value of textNumberPattern
+   */
+  @inline
+  final protected def pattern = e.textNumberPattern
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of the corresponding textNumberRep. Computes number of digits to right of the V.
+   *
+   * SDE if pattern has a syntax error.
+   *
+   * @return the number of digits to the right of the V character. Must be 1 or greater.
+   *
+   */
+  protected def textDecimalVirtualPointFromPattern: Int
+
+  /**
+   * The pattern with escaped characters removed.
+   * This can be reliably searched for pattern characters,
+   * but cannot be used as an operational pattern given that
+   * potentially lots of things have been removed from it.
+   */
+  final protected lazy val patternStripped = {
+    // note: tick == apos == ' == single quote.
+    // First remove entirely all escaped ticks ie., ''
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    // Next remove all tick-escaped characters entirely
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    // the remaining string contains only pattern special characters
+    // and regular non-pattern characters
+    patternNoQuoted
+  }
+
+  protected final lazy val hasV = patternStripped.contains("V")
+  protected final lazy val hasP = patternStripped.contains("P")
+
+  /**
+   * analogous to the property dfdl:binaryDecimalVirtualPoint
+   *
+   * Value is 0 if there is no virtual decimal point.
+   * Value is the number of digits to the right of the 'V'
+   */
+  final lazy val textDecimalVirtualPoint: Int = {
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
+        e.notYetImplemented("textNumberPattern with P symbol")
+      }
+      Assert.invariant(hasV)
+      //
+      // check for things incompatible with "V"
+      //
+      val virtualPoint = textDecimalVirtualPointFromPattern
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  //
+  // We need 2 patterns
+  // 1. The original textNumberPattern string from the schema
+  // 2. Same but with the P or V removed.
+  //
+  // We need to use the original textNumberPattern in diagnostic messages
+  // since that's what the schema author wrote and expects to see.
+  // But if the pattern has P or V, then we need ICU at runtime to use the original
+  // but with the P or V removed since the P and V don't actually represent anything in the data.
+  final protected lazy val runtimePattern = {
+    if (hasV || hasP) {
+      val patternWithoutPV = TextNumberPatternUtils.removeUnquotedPV(pattern)
+      patternWithoutPV
+    }
+    else pattern
+  }
+
+  final protected def checkPatternWithICU(e: ElementBase) = {
+    // Load the pattern to make sure it is valid
+    try {
+      if (hasV || hasP) {
+        new DecimalFormat(runtimePattern)
+      } else {
+        new DecimalFormat(pattern)
+      }
+    } catch {
+      case ex: IllegalArgumentException =>
+        if (hasV || hasP) {
+          // we don't know what the diagnostic message will say here
+          // since it is from the ICU library.
+          // That library has only seen the pattern with the P and V
+          // removed. The messages might show that pattern which would
+          // confuse the schema author since that's not the pattern
+          // they wrote.
+          //
+          // So we try to explain....
+          e.SDE(
+            """Invalid textNumberPattern.
+              | The errors are about the pattern with the P (decimal scaling position)
+              | and V (virtual decimal point) characters removed: %s""".stripMargin, ex)
+        } else {
+          e.SDE("Invalid textNumberPattern: %s", ex)
+        }
+    }
+  }
+}
+
+case class ConvertTextStandardNumberPrim(e: ElementBase)
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
+
+  final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).

Review Comment:
   Resolved in https://github.com/apache/daffodil/pull/910



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -54,11 +56,64 @@ case class ConvertTextCombinatorParser(
   }
 }
 
-case class ConvertTextNumberParser(
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val virtualPointScaleFactor = scala.math.pow(10.0, textDecimalVirtualPoint)

Review Comment:
   resolved in https://github.com/apache/daffodil/pull/910



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
tuxji commented on PR #900:
URL: https://github.com/apache/daffodil/pull/900#issuecomment-1367542419

   DAFFODIL-853 says both V and P symbols need support, but this pull request adds support for only V.  Are you planning to make DAFFODIL-853 just about V and create a new issue for P, or what?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059402696


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),

Review Comment:
   That is a good style suggestion for regexs with capture groups generally. 
   
   The other groups are there in case we need/want to create finer grain diagnostics that point at the smaller parts of the overall pattern, but really I needed them to test that it was working at all. Regexs are notoriously difficult. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059405476


##########
daffodil-lib/src/main/scala/org/apache/daffodil/util/Numbers.scala:
##########
@@ -311,4 +311,28 @@ object Numbers {
   def asAnyRef(n: Any): AnyRef = {
     n.asInstanceOf[AnyRef]
   }
+
+  /**
+   * This is the official way to tell if a JBigDecimal has an integer value.
+   * @param bd A Java BigDecimal
+   * @return true if the value is an integer (has fraction part == 0).
+   */
+  def isIntegerValue (bd: JBigDecimal) = {
+     bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0;
+  }
+
+  /** For any standard predefined subtype of JNumber, is the value numerically zero. */
+  def isZero(n1: JNumber) : Boolean = {
+    n1 match {
+      case d: JDouble => d.doubleValue() == 0.0
+      case f: JFloat => f.floatValue() == 0.0
+      case bd: JBigDecimal => bd.signum() == 0
+      case bi: JBigInt => bi.signum() == 0
+      case l: JLong => l.longValue == 0
+      case i: JInt => i.intValue == 0
+      case s: JShort => s.shortValue == 0
+      case b: JByte => b.byteValue == 0
+    }

Review Comment:
   I will add a default case, as this is supposed to handle the java numbers that we use in our internal infoset representation, which does not use any of the ones you mention here. 
   
   It was definitely a mistake for Scala to not define its own Number base class/trait. One more compromise to reuse of the Java classes. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059401878


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -642,7 +642,7 @@ trait ElementBaseGrammarMixin
 
   private lazy val textConverter = {
     primType match {
-      case _: NodeInfo.Numeric.Kind => ConvertTextNumberPrim(this)
+      case _: NodeInfo.Numeric.Kind => ConvertTextNumberStandardPrim(this)

Review Comment:
   I take your point that it should be ConvertTextStandardNumberPrim. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1060864506


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]

Review Comment:
   I suggest using it when you would make something private, but want a unit test to be able to test it. 
   That's in fact the only thing I've ever seen package private used for. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on PR #900:
URL: https://github.com/apache/daffodil/pull/900#issuecomment-1364351631

   One test failing zoned_textNumberPattern_fail05
   due to change in diagnostic behavior. There is a check for "E" in the pattern, but that is being suppressed by some other error detection. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1058548333


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -78,18 +121,18 @@ case class ConvertTextNumberParser(
     // will match either all or none of 'str', never part of it. Thus,
     // findFirstIn() either matches and it's a zero rep, or it doesn't and it's
     // not a zero
-    val numValue = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {
+    val numValue: DataValueNumber = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {
       case Some(_) => primNumeric.fromNumber(0)

Review Comment:
   Also, test with smaller integer types.  Virtual decimal points should work for type xs:byte even. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1056654682


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,231 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V
+    //
+    val prefix = """([^0-9#PV]?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>

Review Comment:
   Maybe provide better breakdown of syntax to provide better diagnostics here. 
   This may help consolidate all the ad-hoc checking.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Draft: Work in progress - Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1055992518


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -63,36 +142,72 @@ case class ConvertTextNumberPrim(e: ElementBase)
     EntityReplacer { _.replaceForUnparse(zr) }
   }
 
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val (pattern, patternStripped) = {
-      val p = e.textNumberPattern
-
-      if (p.startsWith(";")) {
-        e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
-      }
+  private val pattern = {
+    val p = e.textNumberPattern
+    if (p.startsWith(";")) {

Review Comment:
   e.SDEUnless would be cleaner



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -63,36 +142,72 @@ case class ConvertTextNumberPrim(e: ElementBase)
     EntityReplacer { _.replaceForUnparse(zr) }
   }
 
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val (pattern, patternStripped) = {
-      val p = e.textNumberPattern
-
-      if (p.startsWith(";")) {
-        e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
-      }
+  private val pattern = {
+    val p = e.textNumberPattern
+    if (p.startsWith(";")) {
+      e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
+    }
+    p
+  }
 
-      val noEscapedTicksRegex = """''""".r
-      val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(p, "")
-      val noQuotedRegex = """'[^']+'""".r
-      val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+  private val patternStripped = {
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    patternNoQuoted
+  }
 
-      if (patternNoQuoted.contains("V")) {
-        e.notYetImplemented("textNumberPattern with V symbol")
-      }
+  private val hasV = patternStripped.contains("V")
+  private val hasP = patternStripped.contains("P")
 
-      if (patternNoQuoted.contains("P")) {
+  private val textDecimalVirtualPoint = { // analogous to the property dfdl:binaryDecimalVirtualPoint
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
         e.notYetImplemented("textNumberPattern with P symbol")
       }
+      Assert.invariant(hasV)
+      Assert.invariant(e.textNumberRep eq TextNumberRep.Standard)
+      //
+      // check for things incompatible with "V"
+      //
+      e.schemaDefinitionUnless(e.textStandardBase == 10,
+        "The dfdl:textNumberPattern V (virtual decimal point) requires that " +
+          "dfdl:textStandardBase is 10, but its value was %s",
+        e.textStandardBase)
+      //
+      val virtualPoint = TextNumberPatternUtils.textDecimalVirtualPointForStandard(e, patternStripped)
+
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  private val patternWithoutPV =
+    pattern.replaceAll("P", "").replaceAll("V", "")
+
+  private val runtimePattern =
+    if (hasV || hasP) patternWithoutPV else pattern
+
+  val textNumberFormatEv: TextNumberFormatEv = {
 
       // Load the pattern to make sure it is valid
       try {
-        new DecimalFormat(p)
+        if (!hasV && !hasP)
+          new DecimalFormat(pattern) // But only if it doesn't have P nor V in it, since ICU doesn't support those
       } catch {
         case ex: IllegalArgumentException => e.SDE("Invalid textNumberPattern: " + ex.getMessage())
       }
 
-      (p, patternNoQuoted)
-    }
+      // At this point we have 2 patterns
+      // 1. The original textNumberPattern string from the schema
+      // 2. The original but with the P or V removed (since they dont correspond to anything in the data)
+      //
+      // We need to use the original textNumberPattern in diagnostic messages
+      // since that's what the schema author wrote.
+      // But if the pattern has P or V, then we need ICU at runtime to use the original
+      // but with the P or V removed since the P and V don't actually represent anything in the data.

Review Comment:
   Move this comment up to the val runtimePattern



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -63,36 +142,72 @@ case class ConvertTextNumberPrim(e: ElementBase)
     EntityReplacer { _.replaceForUnparse(zr) }
   }
 
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val (pattern, patternStripped) = {
-      val p = e.textNumberPattern
-
-      if (p.startsWith(";")) {
-        e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
-      }
+  private val pattern = {
+    val p = e.textNumberPattern
+    if (p.startsWith(";")) {
+      e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
+    }
+    p
+  }
 
-      val noEscapedTicksRegex = """''""".r
-      val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(p, "")
-      val noQuotedRegex = """'[^']+'""".r
-      val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+  private val patternStripped = {
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    patternNoQuoted
+  }
 
-      if (patternNoQuoted.contains("V")) {
-        e.notYetImplemented("textNumberPattern with V symbol")
-      }
+  private val hasV = patternStripped.contains("V")
+  private val hasP = patternStripped.contains("P")
 
-      if (patternNoQuoted.contains("P")) {
+  private val textDecimalVirtualPoint = { // analogous to the property dfdl:binaryDecimalVirtualPoint
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
         e.notYetImplemented("textNumberPattern with P symbol")
       }
+      Assert.invariant(hasV)
+      Assert.invariant(e.textNumberRep eq TextNumberRep.Standard)
+      //
+      // check for things incompatible with "V"
+      //
+      e.schemaDefinitionUnless(e.textStandardBase == 10,
+        "The dfdl:textNumberPattern V (virtual decimal point) requires that " +
+          "dfdl:textStandardBase is 10, but its value was %s",
+        e.textStandardBase)
+      //
+      val virtualPoint = TextNumberPatternUtils.textDecimalVirtualPointForStandard(e, patternStripped)
+
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  private val patternWithoutPV =
+    pattern.replaceAll("P", "").replaceAll("V", "")
+
+  private val runtimePattern =
+    if (hasV || hasP) patternWithoutPV else pattern
+
+  val textNumberFormatEv: TextNumberFormatEv = {
 
       // Load the pattern to make sure it is valid
       try {
-        new DecimalFormat(p)
+        if (!hasV && !hasP)
+          new DecimalFormat(pattern) // But only if it doesn't have P nor V in it, since ICU doesn't support those

Review Comment:
   Add comment about this check being unnecessary for P, V case since regex checking of pattern allows only a tiny subset of all that ICU's patterns can do, and we're certain this will be error free. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -99,11 +102,30 @@ case class ConvertTextNumberParser(
 
         // Verify that what was parsed was what was passed exactly in byte count.
         // Use pos to verify all characters consumed & check for errors!
-        if (num == null) {
+        if (num1 == null) {

Review Comment:
   Move this null check above the icuNum match. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -63,36 +142,72 @@ case class ConvertTextNumberPrim(e: ElementBase)
     EntityReplacer { _.replaceForUnparse(zr) }
   }
 
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val (pattern, patternStripped) = {
-      val p = e.textNumberPattern
-
-      if (p.startsWith(";")) {
-        e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
-      }
+  private val pattern = {
+    val p = e.textNumberPattern
+    if (p.startsWith(";")) {
+      e.SDE("Invalid textNumberPattern: The postive number pattern is mandatory")
+    }
+    p
+  }
 
-      val noEscapedTicksRegex = """''""".r
-      val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(p, "")
-      val noQuotedRegex = """'[^']+'""".r
-      val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+  private val patternStripped = {

Review Comment:
   Add comments about what this is for. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1056654063


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -78,18 +121,18 @@ case class ConvertTextNumberParser(
     // will match either all or none of 'str', never part of it. Thus,
     // findFirstIn() either matches and it's a zero rep, or it doesn't and it's
     // not a zero
-    val numValue = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {
+    val numValue: DataValueNumber = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {

Review Comment:
   Test interactions with zero reps, and for 'standard' with float/double with NaN and infinity reps. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -54,11 +54,54 @@ case class ConvertTextCombinatorParser(
   }
 }
 
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val divisor = scala.math.pow(10.0, textDecimalVirtualPoint)
+  final protected val multiplier = scala.math.pow(10.0, textDecimalVirtualPoint)
+
+  final protected def applyTextDecimalVirtualPointForParse(num1: Number): Number = {
+    if (textDecimalVirtualPoint == 0) num1
+    else {
+      // scale for virtual decimal point
+      val scaledNum: Number = num1 match {
+        case l: java.lang.Long => java.math.BigDecimal.valueOf(l).scaleByPowerOfTen(-textDecimalVirtualPoint)

Review Comment:
   Note asymmetry around type Long with unparser. 



##########
daffodil-lib/src/main/scala/org/apache/daffodil/util/DecimalUtils.scala:
##########
@@ -479,7 +479,10 @@ object DecimalUtils {
     decodedValue
   }
 
-  def zonedFromNumber(num: String, zonedStyle: TextZonedSignStyle, opl: OverpunchLocation.Value): String = {
+  def zonedFromNumber(

Review Comment:
   Revert this change. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,231 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V
+    //
+    val prefix = """([^0-9#PV]?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>

Review Comment:
   Maybe provide better breakdown of syntax to provide better diagnostics here. 
   This may help consolidate all the ad-hoc checking.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,231 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V
+    //
+    val prefix = """([^0-9#PV]?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive number pattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param pattern the dfdl:textNumberPattern pattern
+   * @return -1 if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(pattern: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(pattern) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)

Review Comment:
   Better diagnostic behavior here. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059402794


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed, only + for prefix/suffix
+    val re = vPattern.r

Review Comment:
   Good suggestion. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059401614


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -642,7 +642,7 @@ trait ElementBaseGrammarMixin
 
   private lazy val textConverter = {
     primType match {
-      case _: NodeInfo.Numeric.Kind => ConvertTextNumberPrim(this)
+      case _: NodeInfo.Numeric.Kind => ConvertTextNumberStandardPrim(this)

Review Comment:
   Yes, there are two kinds of textNumberRep, standard and zoned. 
   The renaming is because this class is specifically about standard numbers, and the corresponding classes about zoned have "Zoned" in their names. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on PR #900:
URL: https://github.com/apache/daffodil/pull/900#issuecomment-1379683764

   Actually I blew it creating PR 910.
   
   The PR with the additional fixes per this PR review is 
   https://github.com/apache/daffodil/pull/911


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059404719


##########
daffodil-lib/src/main/scala/org/apache/daffodil/util/Numbers.scala:
##########
@@ -311,4 +311,28 @@ object Numbers {
   def asAnyRef(n: Any): AnyRef = {
     n.asInstanceOf[AnyRef]
   }
+
+  /**
+   * This is the official way to tell if a JBigDecimal has an integer value.
+   * @param bd A Java BigDecimal
+   * @return true if the value is an integer (has fraction part == 0).
+   */
+  def isIntegerValue (bd: JBigDecimal) = {
+     bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0;

Review Comment:
   One of the things learned in doing this V feature is that java boxed numbers are awful. They have almost nothing in common, and almost nothing works polymorphically across them. 
   This was the best thing I could find online for determining if a decimal value is actually an integer, and this is awful performance wise since stripTrailingZeros creates another BigInteger object just so you can ask about its scale. 
   
   There *should* be a better way. But finding it is not easy.  
   
   I will convert this into a comment discussing this issue that the Java numbers have almost no polymorphic operations. This was all in service of implementing the isZero() method. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1058548333


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -78,18 +121,18 @@ case class ConvertTextNumberParser(
     // will match either all or none of 'str', never part of it. Thus,
     // findFirstIn() either matches and it's a zero rep, or it doesn't and it's
     // not a zero
-    val numValue = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {
+    val numValue: DataValueNumber = zeroRepsRegex.find { _.findFirstIn(str).isDefined } match {
       case Some(_) => primNumeric.fromNumber(0)

Review Comment:
   Also, test with smaller integer types.  Virtual decimal points should work for type xs:byte even. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1058637583


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,6 +46,85 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
+/**
+ * A regex which matches textNumberPatterns that legally use the V
+ * (implied decimal point) character.
+ */
+object TextNumberPatternUtils {
+
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V
+    //
+    val prefix = """([^0-9#PV]?)"""

Review Comment:
   Not correct. When 'V' or 'P' are present, these ARE the only allowed special pattern characters. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1060871223


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param patternStripped the dfdl:textNumberPattern pattern - with all quoting removed.
+   * @return None if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise Some(N) where N is the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(patternStripped) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"

Review Comment:
   I suspect a split-based algorithm per your suggestion below, is better. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
tuxji commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1059145342


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed, only + for prefix/suffix
+    val re = vPattern.r

Review Comment:
   Likewise, I would define each capturing group here as val prefix, val beforeV, val afterV, val suffix, and interpolate them into vPattern at line 104.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed, only + for prefix/suffix
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param pattern the dfdl:textNumberPattern pattern
+   * @return -1 if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(pattern: String): Option[Int] = {

Review Comment:
   Please add a similar scaladoc to textDecimalVirtualPointForStandard at line 78 too.  Also, should textDecimalVirtualPointForStandard be split into 2 overloaded functions or should the 2 overloaded textDecimalVirtualPointForZoned functions be merged together so that both textDecimalVirtualPointForStandard and textDecimalVirtualPointForZoned are defined consistently?



##########
daffodil-lib/src/main/scala/org/apache/daffodil/util/Numbers.scala:
##########
@@ -311,4 +311,28 @@ object Numbers {
   def asAnyRef(n: Any): AnyRef = {
     n.asInstanceOf[AnyRef]
   }
+
+  /**
+   * This is the official way to tell if a JBigDecimal has an integer value.
+   * @param bd A Java BigDecimal
+   * @return true if the value is an integer (has fraction part == 0).
+   */
+  def isIntegerValue (bd: JBigDecimal) = {
+     bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0;

Review Comment:
   I don't see any code in this PR calling this new function and it has a coverage warning which seems to indicate this function is not used anywhere.



##########
daffodil-lib/src/main/scala/org/apache/daffodil/util/Numbers.scala:
##########
@@ -311,4 +311,28 @@ object Numbers {
   def asAnyRef(n: Any): AnyRef = {
     n.asInstanceOf[AnyRef]
   }
+
+  /**
+   * This is the official way to tell if a JBigDecimal has an integer value.
+   * @param bd A Java BigDecimal
+   * @return true if the value is an integer (has fraction part == 0).
+   */
+  def isIntegerValue (bd: JBigDecimal) = {
+     bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0;
+  }
+
+  /** For any standard predefined subtype of JNumber, is the value numerically zero. */
+  def isZero(n1: JNumber) : Boolean = {
+    n1 match {
+      case d: JDouble => d.doubleValue() == 0.0
+      case f: JFloat => f.floatValue() == 0.0
+      case bd: JBigDecimal => bd.signum() == 0
+      case bi: JBigInt => bi.signum() == 0
+      case l: JLong => l.longValue == 0
+      case i: JInt => i.intValue == 0
+      case s: JShort => s.shortValue == 0
+      case b: JByte => b.byteValue == 0
+    }

Review Comment:
   Out of curiosity, does the Scala compiler or your IDE consider this match expression to be an exhaustive match or why didn't either of them complain about there being no default or case _ branch?  I found JNumber's javadoc page in Java 8 and it's publicly extensible (not final or sealed in any way I can tell) and has more direct subclasses than the above list ([AtomicInteger](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html), [AtomicLong](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicLong.html), [BigDecimal](https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html), [BigInteger](https://docs.oracle.com/javase/8/docs/api/java/math/BigInteger.html), [Byte](https://docs.oracle.com/javase/8/docs/api/java/lang/Byte.html), [Double](https://docs.oracle.com/javase/8/docs/api/java/lang/Double.html), [DoubleAccumulator](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/DoubleAccumulator.html
 ), [DoubleAdder](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/DoubleAdder.html), [Float](https://docs.oracle.com/javase/8/docs/api/java/lang/Float.html), [Integer](https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html), [Long](https://docs.oracle.com/javase/8/docs/api/java/lang/Long.html), [LongAccumulator](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/LongAccumulator.html), [LongAdder](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/LongAdder.html), [Short](https://docs.oracle.com/javase/8/docs/api/java/lang/Short.html)).



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -642,7 +642,7 @@ trait ElementBaseGrammarMixin
 
   private lazy val textConverter = {
     primType match {
-      case _: NodeInfo.Numeric.Kind => ConvertTextNumberPrim(this)
+      case _: NodeInfo.Numeric.Kind => ConvertTextNumberStandardPrim(this)

Review Comment:
   What is the reason for renaming ConvertTextNumberPrim to ConvertTextNumberStandardPrim?  That is, what does the "Standard" word mean in this context?  I notice "Standard" occurs in a lot of places within the codebase: 
   
   - java.nio.charset.StandardCharsets
   - java.nio.file.StandardOpenOption
   - org.apache.daffodil.processors.charset.StandardBitsCharsets
   - requireTextStandardBaseProperty
   - textStandardBase="10"
   - textStandardGroupingSeparator=","
   - textStandardDecimalSeparator="."
   - textStandardExponentRep="E"
   - textStandardExponentCharacter (deprecated)
   - textStandardInfinityRep="Inf"
   - textStandardNaNRep="NaN"
   - textStandardZeroRep=""
   - TextStandardBaseMixin
   - AsciiStandard
   (This wasn't a complete list, by the way.)
   
   If the intention is to say the converter uses these textStandard* properties, should the new name be ConvertTextStandardNumberPrim instead?
   



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PrimitivesTextNumber1.scala:
##########
@@ -54,11 +56,58 @@ case class ConvertTextCombinatorParser(
   }
 }
 
+trait TextDecimalVirtualPointMixin {
+  def textDecimalVirtualPoint: Int
+
+  final protected val divisor = scala.math.pow(10.0, textDecimalVirtualPoint)
+  final protected val multiplier = scala.math.pow(10.0, textDecimalVirtualPoint)

Review Comment:
   I'm confused, how come you are defining both divisor and multiplier as the same value?  Please add a comment explaining why.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesZoned.scala:
##########
@@ -47,71 +45,61 @@ case class ConvertZonedCombinator(e: ElementBase, value: Gram, converter: Gram)
 }
 
 case class ConvertZonedNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+  extends Terminal(e, true)
+  with ConvertTextNumberMixin {
 
-  val textNumberFormatEv: TextNumberFormatEv = {
-    val pattern = {
-      val p = e.textNumberPattern
+  final override protected val pattern =  e.textNumberPattern
 
-      val noEscapedTicksRegex = """''""".r
-      val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(p, "")
-      val noQuotedRegex = """'[^']+'""".r
-      val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
-      val zonedPatternRegex = "\\+?#*[0-9]+\\+?".r
+  lazy val textDecimalVirtualPoint = computeTextDecimalVirtualPoint(e)
 
-      if (patternNoQuoted.contains("V")) {
-        e.notYetImplemented("textNumberPattern with V symbol")
-      }
+  lazy val textNumberFormatEv: TextNumberFormatEv = {
 
-      if (patternNoQuoted.contains("P")) {
-        e.notYetImplemented("textNumberPattern with P symbol")
-      }
+    if (patternStripped.contains("@")) {
+      e.SDE("The '@' symbol may not be used in textNumberPattern for textNumberRep='zoned'")
+    }
 
-      if (patternNoQuoted.contains("@")) {
-        e.SDE("The '@' symbol may not be used in textNumberPattern for textNumberRep='zoned'")
-      }
+    if (patternStripped.contains("E")) {
+      e.SDE("The 'E' symbol may not be used in textNumberPattern for textNumberRep='zoned'")
+    }
 
-      if (patternNoQuoted.contains(";")) {
-        e.SDE("Negative patterns may not be used in textNumberPattern for textNumberRep='zoned'")
-      }
+    e.schemaDefinitionWhen(patternStripped.contains(";"),
+      "Negative patterns may not be used in textNumberPattern for textNumberRep='zoned'")
 
-      e.primType match {
-        case PrimType.Double | PrimType.Float => e.SDE("textZonedFormat='zoned' does not support Doubles/Floats")
-        case PrimType.UnsignedLong | PrimType.UnsignedInt | PrimType.UnsignedShort | PrimType.UnsignedByte => {
-          if (e.textNumberCheckPolicy == TextNumberCheckPolicy.Lax) {
-            if ((patternNoQuoted(0) != '+') && (patternNoQuoted(patternNoQuoted.length - 1) != '+'))
-              e.SDE("textNumberPattern must have '+' at the beginning or the end of the pattern when textZonedFormat='zoned' and textNumberPolicy='lax' for unsigned numbers")
-          }
-        }
-        case _ => {
-          if ((patternNoQuoted(0) != '+') && (patternNoQuoted(patternNoQuoted.length - 1) != '+'))
-            e.SDE("textNumberPattern must have '+' at the beginning or the end of the pattern when textZonedFormat='zoned' for signed numbers")
+    e.primType match {
+      case PrimType.Double | PrimType.Float => e.SDE("textNumberRep='zoned' does not support Doubles/Floats")
+      case PrimType.UnsignedLong | PrimType.UnsignedInt | PrimType.UnsignedShort | PrimType.UnsignedByte => {
+        if (e.textNumberCheckPolicy == TextNumberCheckPolicy.Lax) {
+          if ((patternStripped(0) != '+') && (patternStripped(patternStripped.length - 1) != '+'))
+            e.SDE("textNumberPattern must have '+' at the beginning or the end of the pattern when textNumberRep='zoned' and textNumberPolicy='lax' for unsigned numbers")
         }
       }
+      case _ => {
+        if ((patternStripped(0) != '+') && (patternStripped(patternStripped.length - 1) != '+'))
+          e.SDE("textNumberPattern must have '+' at the beginning or the end of the pattern when textNumberRep='zoned' for signed numbers")
+      }
+    }
 
-      if ((patternNoQuoted(0) == '+') && (patternNoQuoted(patternNoQuoted.length - 1) == '+'))
-        e.SDE("The textNumberPattern may either begin or end with a '+', not both.")
-
-      if (!zonedPatternRegex.pattern.matcher(patternNoQuoted).matches)
-        e.SDE("Invalid characters used in textNubmerPattern for zoned number. Only the following characters may be used with zoned numbers: '+', 'V', 'P', '0-9', and '#'")
+    if ((patternStripped(0) == '+') && (patternStripped(patternStripped.length - 1) == '+'))
+      e.SDE("The textNumberPattern may either begin or end with a '+', not both.")
 
-      // Load the pattern to make sure it is valid
-      try {
-        new DecimalFormat(p.replace("+", ""))
-      } catch {
-        case ex: IllegalArgumentException => e.SDE("Invalid textNumberPattern: " + ex.getMessage())
+    if (textDecimalVirtualPoint > 0) {
+      e.primType match {
+        case PrimType.Decimal => // ok
+        case _ => e.SDE(
+          """The dfdl:textNumberPattern has a virtual decimal point 'V' and dfdl:textNumberRep='zoned'.
+            | The type must be xs:decimal, but was: %s.""".stripMargin, e.primType.globalQName.toPrettyString)
       }
-
-      p
     }
 
+    checkPatternWithICU(e)
+
     /* Need to remove the '+' from the number pattern as '+' is only
     *  used to indicate which digit of the number is overpunched when
     *  dealing with zoned decimal formats. If the '+' is not removed
     *  DecimalFormat will attempt to use it as an indicator for exponent
     *  numbers, which will most likely not match the zoned number being
     *  parsed */
-    val zonedPattern = pattern.replace("+", "")
+    val zonedPattern = runtimePattern.replace("+", "")

Review Comment:
   This comment mentions DecimalFormat but we've removed import DecimalFormat from this file which is confusing.  It might be better to say we're removing '+' to make ConvertZonedNumberParser's job easier (we construct textNumberFormatEv with zonedPattern, pass textNumberFormatEv to ConvertZonedNumberParser, and call textNumberFormatEv.evaluate(start) there).



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),
+          originalPattern)
+    }
+  }
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val vPattern = """^(\+?)([0-9]+)V([0-9]+)(\+?)$""" // only positive pattern allowed, only + for prefix/suffix
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.
+   *
+   * @param pattern the dfdl:textNumberPattern pattern
+   * @return -1 if the pattern is illegal syntax for use with V (virtual decimal point)
+   *         otherwise the number of digits to the right of the V character.
+   */
+  private[primitives] def textDecimalVirtualPointForZoned(pattern: String): Option[Int] = {
+    val r = TextNumberPatternUtils.vregexZoned
+    r.findFirstMatchIn(pattern) match {
+      // note: cannot have both a prefix and suffix. Only one of them.
+      case Some(r(pre, _, afterV, suf)) if (pre.length + suf.length <= 1) => Some(afterV.length)
+      case _ => None
+    }
+  }
+
+  def textDecimalVirtualPointForZoned(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    textDecimalVirtualPointForZoned(patternStripped).getOrElse {
+      e.SDE(
+        s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+           |Other than the leading or trailing '+' sign indicator,
+           |it can contain only digits 0-9.""".stripMargin('|'),
+        originalPattern)
+    }
+  }
+
+  /**
+   * The apos character (') is the quoting character in ICU patterns (and DFDL textNumberPattern).
+   * This removes any unquoted P or V characters (which are not implemented by ICU)
+   * leaving a pattern string that is suitable for use with ICU.
+   * @param pattern the textNumberPattern string
+   * @return the pattern string with all unquoted P and unquoted V removed.
+   */
+  def removeUnquotedPV(pattern: String) : String = {
+    val uniqueQQString = "alsdflslkjskjslkkjlkkjfppooiipsldsflj"
+    // A single regex that matches an unquoted character
+    // where the quoting char is self quoting requires
+    // a zero-width look behind that matches a potentially
+    // unbounded number of quote chars. That's not allowed.
+    //
+    // Consider ''''''P. We want a regex that matches only the P here
+    // because it is preceded by an even number of quotes.
+    //
+    // I did some tests, and the formulations you think might work
+    // such as from stack-overflow, don't work.
+    // (See test howNotToUseRegexLookBehindWithReplaceAll)
+    //
+    // So we use this brute force technique of replacing all ''
+    // first, so we have only single quotes to deal with.
+    pattern.replaceAll("''", uniqueQQString).
+      replaceAll("(?<!')P", "").
+      replaceAll("(?<!')V", "").
+      replaceAll(uniqueQQString, "''")
+  }
+}
+
+trait ConvertTextNumberMixin {
+
+  protected def pattern: String
+
+
+  // The pattern with escaped characters removed.
+  // This can be reliably searched for pattern characters,
+  // but cannot be used as an operational pattern given that
+  // potentially lots of things have been removed from it.
+  final protected lazy val patternStripped = {
+    // note: tick == apos == ' == single quote.
+    // First remove entirely all escaped ticks ie., ''
+    val noEscapedTicksRegex = """''""".r
+    val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
+    // Next remove all tick-escaped characters entirely
+    val noQuotedRegex = """'[^']+'""".r
+    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    // the remaining string contains only pattern special characters
+    // and regular non-pattern characters
+    patternNoQuoted
+  }
+
+  protected final lazy val hasV = patternStripped.contains("V")
+  protected final lazy val hasP = patternStripped.contains("P")
+
+  // analogous to the property dfdl:binaryDecimalVirtualPoint
+  protected final def computeTextDecimalVirtualPoint(e: ElementBase) = {
+    if (!hasV && !hasP) 0 // no virtual point
+    else {
+      if (hasP) {
+        e.notYetImplemented("textNumberPattern with P symbol")
+      }
+      Assert.invariant(hasV)
+      //
+      // check for things incompatible with "V"
+      //
+      val virtualPoint = e.textNumberRep match {
+        case TextNumberRep.Standard =>
+          TextNumberPatternUtils.textDecimalVirtualPointForStandard(e, pattern, patternStripped)
+        case TextNumberRep.Zoned =>
+          TextNumberPatternUtils.textDecimalVirtualPointForZoned(e, pattern, patternStripped)
+      }
+
+      Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
+      virtualPoint
+    }
+  }
+
+  //
+  // We need 2 patterns
+  // 1. The original textNumberPattern string from the schema
+  // 2. Same but with the P or V removed.
+  //
+  // We need to use the original textNumberPattern in diagnostic messages
+  // since that's what the schema author wrote and expects to see.
+  // But if the pattern has P or V, then we need ICU at runtime to use the original
+  // but with the P or V removed since the P and V don't actually represent anything in the data.
+  final protected lazy val runtimePattern = {
+    if (hasV || hasP) {
+      val patternWithoutPV = TextNumberPatternUtils.removeUnquotedPV(pattern)
+      patternWithoutPV
+    }
+    else pattern
+  }
+
+  final protected def checkPatternWithICU(e: ElementBase) = {
+    // Load the pattern to make sure it is valid
+    try {
+      if (hasV || hasP) {
+        new DecimalFormat(runtimePattern)
+      } else {
+        new DecimalFormat(pattern)
+      }
+    } catch {
+      case ex: IllegalArgumentException =>
+        if (hasV || hasP) {
+          // we don't know what the diagnostic message will say here
+          // since it is from the ICU library.
+          // That library has only seen the pattern with the P and V
+          // removed. The messages might show that pattern which would
+          // confuse the schema author since that's not the pattern
+          // they wrote.
+          //
+          // So we try to explain....
+          e.SDE(
+            """Invalid textNumberPattern.
+              | The errors are about the pattern with the P (decimal scaling position)
+              | and V (virtual decimal point) characters removed: %s""".stripMargin, ex)

Review Comment:
   Consider adding a test with a deliberately bad "P" pattern which ICU rejects to cover this if branch, although it's only a weak "consider" in order to turn the CI's coverage check green.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +46,226 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+    val prefix = """([^0-9#PV']?)"""
+    val suffix = prefix // same syntax as prefix
+    val vpinteger = """(#*[0-9]+)V([0-9]+)"""
+    val subpattern = s"""${prefix}${vpinteger}${suffix}"""
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+  def textDecimalVirtualPointForStandard(
+    e: ImplementsThrowsOrSavesSDE,
+    originalPattern: String,
+    patternStripped: String): Int = {
+    val r = TextNumberPatternUtils.vregexStandard
+    r.findFirstMatchIn(patternStripped) match {
+      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+      case None =>
+        e.SDE(
+          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal point).
+             | Other than the sign indicators, it can contain only
+             | '#', then digits 0-9 then 'V' then digits 0-9.
+             | The positive part of the dfdl:textNumberPattern is mandatory.""".stripMargin('|'),

Review Comment:
   We use only one of vregexStandard's capturing groups here, `afterV.length`.  The only place we use the rest of the capturing groups is in TestPrimitive test cases to verify that each group captures text as expected.  Do we really need the rest of the capturing groups, that is, how high is the likelihood that Daffodil's package "primitives" will ever use them and need to know what they contain?
   
   I'm not suggesting that you name the other groups here since IDEs would complain about unused names.  Rather, name all the capturing groups at lines 67-73 as val prefix = """(group1)"", val beforeV = """(group2)""", all the way to val negSuffix, and finally interpolate all the capturing groups into the entire pattern as val vPattern = s"""^$prefix${beforeV}V$afterV$suffix(?:;$negPrefix${negBeforeV}V$negAfterV$negSuffix)?$$""".  Then it will be clear what the names of each capturing group should be in any calling code's Some(r(prefix, beforeV, afterV, suffix, negPrefix, negBeforeV, negAfterV, negSuffix)) pattern patch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on pull request #900: Draft: Work in progress - Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on PR #900:
URL: https://github.com/apache/daffodil/pull/900#issuecomment-1363551336

   This has changed only for textNumberRep 'standard' now, I want to debug that first, then apply changes to 'zoned' and refactor common stuff. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Draft: Work in progress - Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1056001636


##########
daffodil-core/src/test/scala/org/apache/daffodil/grammar/primitives/TestPrimitives.scala:
##########
@@ -0,0 +1,76 @@
+package org.apache.daffodil.grammar.primitives

Review Comment:
   Missing banner. Fails rat check. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1060866542


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.
+private[primitives]
+object TextNumberPatternUtils {
+
+  /**
+   * A regex which matches textNumberPatterns that legally use the V
+   * (implied decimal point) character.
+   */
+   private[primitives] lazy val vregexStandard = {
+    //  DFDL v1.0 Spec says
+    //  It is a Schema Definition Error if any symbols other than "0", "1" through "9" or #
+    //  are used in the vpinteger region of the pattern.
+    //
+    // The prefix and suffix chars can surround the vpinteger region, and there can be
+    // a positive and a negative pattern.
+    //
+    // The prefix and suffix cannot be digits, # or P or V. No quoted chars in prefix either.
+    //
+     val prefixChars = """[^0-9#PV']?""" // same for suffix.
+     val beforeVChars = """#*[0-9]+"""
+     val afterVChars = """[0-9]+"""
+     //
+     // Each of the capture groups is defined here in order
+     //
+     val posPrefix = s"""($prefixChars)"""
+     val posBeforeV = s"""($beforeVChars)"""
+     val posAfterV = s"""($afterVChars)"""
+     val posSuffix = posPrefix
+     val negPrefix = posPrefix
+     val negBeforeV = posBeforeV
+     val negAfterV = posAfterV
+     val negSuffix = posPrefix
+     // don't forget the ^ and $ (start of data, end of data) because we want
+     // the match to consume all the characters, starting at the beginning.
+    val vPattern=
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+    val re = vPattern.r
+    re
+  }
+
+
+
+  private[primitives] lazy val vregexZoned= {
+    // Note: for zoned, can only have a positive pattern
+    // Prefix or suffix can only be '+' character, to
+    // indicate leading or trailing sign, and can only
+    // one of those.
+    //
+    // Also we're not allowing the # character, since I think that
+    // makes no sense for zoned with virtual decimal point.
+    //
+    val prefixChars = """\+?""" // only + for prefix/suffix
+    val aroundVChars = """[0-9]+"""
+    //
+    // capture groups are defined here in sequence
+    //
+    val prefix = s"""($prefixChars)"""
+    val beforeV = s"""($aroundVChars)"""
+    val afterV = beforeV
+    val suffix = prefix
+    val vPattern = s"""^${prefix}${beforeV}V${afterV}${suffix}$$""" // only positive pattern allowed
+    val re = vPattern.r
+    re
+  }
+
+  /**
+   * Checks a pattern for suitability with V (virtual decimal point) in the pattern
+   * in the context of zoned textNumberRep.
+   *
+   * This is broken out separately for unit testing purposes.

Review Comment:
   If this was a method of the class it naturally falls in (responsibility-wise), then it would not be visible for convenient unit testing without having to construct a bunch of schema-compiler infrastructure objects or some complex test rig. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #900: Implement textNumberPattern 'V' virtual decimal point.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #900:
URL: https://github.com/apache/daffodil/pull/900#discussion_r1060863855


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -46,8 +45,249 @@ case class ConvertTextCombinator(e: ElementBase, value: Gram, converter: Gram)
   override lazy val unparser = new ConvertTextCombinatorUnparser(e.termRuntimeData, value.unparser, converter.unparser)
 }
 
-case class ConvertTextNumberPrim(e: ElementBase)
-  extends Terminal(e, true) {
+// This is a separate object for unit testing purposes.

Review Comment:
   I should have made this scaladoc.
   I should have said "This object exists to expose some code for convenient unit testing that would otherwise be inaccessible."  It's not that the whole object can move into src/test. It's just there so we can do a bunch of unit tests on the regexs it contains. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org