Posted to j-dev@xerces.apache.org by Nick Brachet <ni...@edocs.com> on 2002/04/21 03:59:47 UTC

[PATCH] Consistent Locator for CDATA, comments & PI

The following is a patch against Xerces-J_2_0_1 to ensure the information
returned by Locator for CDATA, comments & PI events is consistent with the
other events.
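
To see what the Locator reports for each kind of event, a handler can record
the line and column on every callback. The following is only a minimal sketch,
assuming the SAX parser bundled with the JDK; the class name LocatorProbe and
the method probe() are made up for illustration and are not part of Xerces.
The exact columns printed depend on the parser version, so none are shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Xerces): records where the Locator
// points each time a SAX callback fires.
public class LocatorProbe {

    static List<String> probe(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            private void record(String kind) {
                // A parser is not strictly required to supply a Locator.
                if (locator == null) {
                    events.add(kind + " (no locator)");
                } else {
                    events.add(kind + " line=" + locator.getLineNumber()
                            + " col=" + locator.getColumnNumber());
                }
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                record("startElement");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                record("characters");
            }

            @Override
            public void processingInstruction(String target, String data) {
                record("processingInstruction");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // One PI, one CDATA section and some plain text to compare.
        String xml = "<a>text<?pi data?><![CDATA[cdata]]></a>";
        for (String event : probe(xml)) {
            System.out.println(event);
        }
    }
}
```

Comparing the column reported for the plain-text characters() event with the
ones reported for the CDATA and PI events is one way to observe the
inconsistency described here. Comment events would additionally need a
LexicalHandler, which this sketch leaves out.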

I understand that Locator gives only an approximate position, but this patch
makes it consistent across all events, in particular characters().
For most events the Locator "points to the end of the node". The exceptions
are CDATA sections, comments & PIs, where the Locator may point a few columns
ahead. This is because
org.apache.xerces.impl.XMLEntityManager#scanData(String, XMLString) reads up
to _and including_ the delimiter, and once the delimiter is consumed the
Locator object points after it. This patch changes scanData() so that it no
longer consumes the delimiter.

There is also a fix to properly handle delimiters that are longer than 2
chars. This simplifies
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl#scanCDATASection(boolean),
which can now simply scan for "]]>" -- unless I'm missing something...
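
The intended contract can be illustrated with a toy scanner: scanData() stops
AT the delimiter, and the caller then skips it explicitly. This is only a
sketch under simplifying assumptions (the whole input is held in one String,
there is no buffer refill and no line/column bookkeeping); MiniScanner and its
methods are hypothetical names, not the real Xerces EntityScanner API, whose
scanData() returns a boolean and fills an XMLString.

```java
// Toy illustration of "scan up to, but not including, the delimiter".
public class DelimiterScanSketch {

    /** Hypothetical miniature scanner; not the Xerces EntityScanner. */
    static final class MiniScanner {
        private final String input;
        private int position = 0;

        MiniScanner(String input) {
            this.input = input;
        }

        /**
         * Returns the text before the next occurrence of the delimiter and
         * leaves the position at the delimiter, not after it. If the
         * delimiter does not occur, the rest of the input is consumed.
         */
        String scanData(String delimiter) {
            int hit = input.indexOf(delimiter, position);
            int end = (hit < 0) ? input.length() : hit;
            String data = input.substring(position, end);
            position = end;
            return data;
        }

        /** Consumes s if the input continues with it at the current position. */
        boolean skipString(String s) {
            if (input.startsWith(s, position)) {
                position += s.length();
                return true;
            }
            return false;
        }

        int position() {
            return position;
        }
    }

    public static void main(String[] args) {
        // "]]" inside the data is not a terminator; only "]]>" is.
        MiniScanner scanner = new MiniScanner("a]]b]]>tail");

        String data = scanner.scanData("]]>");
        // The position is at the end of the data, where a Locator should point.
        System.out.println(data + " @" + scanner.position()); // prints: a]]b @4

        // The caller consumes the delimiter only after reporting the data.
        scanner.skipString("]]>");
        System.out.println("after delimiter @" + scanner.position()); // prints: after delimiter @7
    }
}
```

Because the full three-character delimiter is matched in one go, the caller no
longer needs to count stray ']' characters itself, which is the simplification
made to scanCDATASection(boolean) in the patch.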

-+-+-+-+-+-+-+-+-+-+
Index: java/src/org/apache/xerces/impl/XMLDocumentFragmentScannerImpl.java
===================================================================
RCS file: /home/cvspublic/xml-xerces/java/src/org/apache/xerces/impl/XMLDocumentFragmentScannerImpl.java,v
retrieving revision 1.10
diff -u -w -c -r1.10 XMLDocumentFragmentScannerImpl.java
*** java/src/org/apache/xerces/impl/XMLDocumentFragmentScannerImpl.java 29 Jan 2002 03:44:36 -0000 1.10
--- java/src/org/apache/xerces/impl/XMLDocumentFragmentScannerImpl.java 21 Apr 2002 00:59:36 -0000
***************
*** 892,921 ****
          }
  
          while (true) {
!             if (!fEntityScanner.scanData("]]", fString)) {
                  if (fDocumentHandler != null && fString.length > 0) {
                      fDocumentHandler.characters(fString, null);
                  }
!                 int brackets = 2;
!                 while (fEntityScanner.skipChar(']')) {
!                     brackets++;
!                 }
!                 if (fDocumentHandler != null && brackets > 2) {
!                     fStringBuffer.clear();
!                     for (int i = 2; i < brackets; i++) {
!                         fStringBuffer.append(']');
!                     }
!                     fDocumentHandler.characters(fStringBuffer, null);
!                 }
!                 if (fEntityScanner.skipChar('>')) {
                      break;
                  }
-                 if (fDocumentHandler != null) {
-                     fStringBuffer.clear();
-                     fStringBuffer.append("]]");
-                     fDocumentHandler.characters(fStringBuffer, null);
-                 }
-             }
              else {
                  if (fDocumentHandler != null) {
                      fDocumentHandler.characters(fString, null);
--- 892,904 ----
          }
  
          while (true) {
!             if (!fEntityScanner.scanData("]]>", fString)) {
                  if (fDocumentHandler != null && fString.length > 0) {
                      fDocumentHandler.characters(fString, null);
                  }
!                 fEntityScanner.skipString("]]>");
                  break;
              }
              else {
                  if (fDocumentHandler != null) {
                      fDocumentHandler.characters(fString, null);
Index: java/src/org/apache/xerces/impl/XMLEntityManager.java
===================================================================
RCS file: /home/cvspublic/xml-xerces/java/src/org/apache/xerces/impl/XMLEntityManager.java,v
retrieving revision 1.25
diff -u -w -c -r1.25 XMLEntityManager.java
*** java/src/org/apache/xerces/impl/XMLEntityManager.java 31 Jan 2002 15:17:56 -0000 1.25
--- java/src/org/apache/xerces/impl/XMLEntityManager.java 21 Apr 2002 00:59:38 -0000
***************
*** 2582,2588 ****
           * Scans a range of character data up to the specicied delimiter,
           * setting the fields of the XMLString structure, appropriately.
           * <p>
!          * <strong>Note:</strong> The characters are consumed.
           * <p>
           * <strong>Note:</strong> This assumes that the internal buffer is
           * at least the same size, or bigger, than the length of the delimiter
--- 2582,2589 ----
           * Scans a range of character data up to the specicied delimiter,
           * setting the fields of the XMLString structure, appropriately.
           * <p>
!          * <strong>Note:</strong> The characters (up to but not including
!          * the delimiter) are consumed.
           * <p>
           * <strong>Note:</strong> This assumes that the internal buffer is
           * at least the same size, or bigger, than the length of the delimiter
***************
*** 2728,2734 ****
                          }
                          c = fCurrentEntity.ch[fCurrentEntity.position++];
                          if (delimiter.charAt(i) != c) {
!                             fCurrentEntity.position--;
                              break;
                          }
                      }
--- 2729,2735 ----
                          }
                          c = fCurrentEntity.ch[fCurrentEntity.position++];
                          if (delimiter.charAt(i) != c) {
!                             fCurrentEntity.position -= i;
                              break;
                          }
                      }
***************
*** 2746,2756 ****
                      break;
                  }
              }
-             int length = fCurrentEntity.position - offset;
-             fCurrentEntity.columnNumber += length - newlines;
              if (done) {
!                 length -= delimLen;
              }
              data.setValues(fCurrentEntity.ch, offset, length);
  
              // return true if string was skipped
--- 2747,2757 ----
                      break;
                  }
              }
              if (done) {
!                 fCurrentEntity.position -= delimLen;
              }
+             int length = fCurrentEntity.position - offset;
+             fCurrentEntity.columnNumber += length - newlines;
              data.setValues(fCurrentEntity.ch, offset, length);
  
              // return true if string was skipped
Index: java/src/org/apache/xerces/impl/XMLScanner.java
===================================================================
RCS file: /home/cvspublic/xml-xerces/java/src/org/apache/xerces/impl/XMLScanner.java,v
retrieving revision 1.12
diff -u -w -c -r1.12 XMLScanner.java
*** java/src/org/apache/xerces/impl/XMLScanner.java 29 Jan 2002 20:44:02 -0000 1.12
--- java/src/org/apache/xerces/impl/XMLScanner.java 21 Apr 2002 00:59:39 -0000
***************
*** 633,638 ****
--- 633,640 ----
              data.setValues(fStringBuffer);
          }
  
+         fEntityScanner.skipString("?>");
+ 
      } // scanPIData(String,XMLString)
  
      /**
***************
*** 670,675 ****
--- 672,678 ----
              }
          }
          text.append(fString);
+         fEntityScanner.skipString("--");
          if (!fEntityScanner.skipChar('>')) {
              reportFatalError("DashDashInComment", null);
          }
-+-+-+-+-+-+-+-+-+-+

Nick.
---
"I think of programming with beauty in mind, as being something elegant,
something that you can be proud of for the way it fits together." Donald E.
Knuth.
