Posted to dev@xmlbeans.apache.org by "Jacob Danner (JIRA)" <xm...@xml.apache.org> on 2005/04/14 00:56:34 UTC

[jira] Updated: (XMLBEANS-135) bad handling of embeded CDATA

     [ http://issues.apache.org/jira/browse/XMLBEANS-135?page=history ]

Jacob Danner updated XMLBEANS-135:
----------------------------------

    Fix Version: TBD

probably not in the v2 release

> bad handling of embeded CDATA
> -----------------------------
>
>          Key: XMLBEANS-135
>          URL: http://issues.apache.org/jira/browse/XMLBEANS-135
>      Project: XMLBeans
>         Type: Bug
>     Versions: Version 1.0.3, Version 1.0.4, Version 2 Beta 1
>  Environment: Encountered on Windows with JDK 1.4.2.
>     Reporter: Martin Hamel
>      Fix For: TBD

>
> I have a case of bad XML. It is an envelope document that includes another
> document. The parser expects the enclosed document to be in a CDATA section.
> The problem is that the second document now includes a third document, which
> is also expected to be in a CDATA section.
> I create document A with an XMLBean. I put it as a text element of document B
> after transforming document A to a string with xmlText(). I then do the same
> with document B by putting it in document C. Everything works well and
> automatically, and it creates a CDATA section every time it needs to.
>         // fragment
>         XmlOptions options = new XmlOptions();
>         options.setSavePrettyPrint();
>         Field field = getAssessmentFields().addNewField();
>         field.setFieldName("AssessmentContent");
>         field.setFieldValue(answersDocument.xmlText(options));
>         ..
> The problem is that on the second escaping, the ">" of the inner CDATA end
> marker (]]>) is escaped to "&gt;". The SAX parser that reads all this (Xalan)
> cannot handle the result. Also, the specification says that a CDATA section
> must not contain another CDATA section.
> Here is the modification I made for embedded CDATA. Do you think it would be
> worthy of being included?
> Here is the modified entitizeContent method in Saver.java:
>         Pattern cdataPattern = Pattern.compile("CDATA");
>         private void entitizeContent ( )
>         {
>             if (_lastEmitCch == 0)
>                 return;
>             int i = _lastEmitIn;
>             final int n = _buf.length;
>             boolean hasOutOfRange = false;
>             
>             int count = 0;
>             for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>             {                
>                 char ch = _buf[ i ];
>                 if (ch == '<' || ch == '&')
>                     count++;
>                 else if (isBadChar( ch ))
>                     hasOutOfRange = true;
>                 if (++i == n)
>                     i = 0;
>             }
>             if (count == 0 && !hasOutOfRange)
>                 return;
>             i = _lastEmitIn;
>             //
>             // Heuristic for knowing when to save out stuff as a CDATA.
>             //
>             
>             // We'll check whether the buffer already contains a CDATA.
>             // If it does, we won't nest another one.
>             CharBuffer charBuffer = CharBuffer.wrap(_buf);
>             boolean hasCDATA = cdataPattern.matcher(charBuffer).find();            
>             if (_lastEmitCch > 32 && count > 5 &&
>                     count * 100 / _lastEmitCch > 1 && !hasCDATA)
>             {
>                 boolean lastWasBracket = _buf[ i ] == ']';
>                 i = replace( i, "<![CDATA[" + _buf[ i ] );
>                 boolean secondToLastWasBracket = lastWasBracket;
>                 lastWasBracket = _buf[ i ] == ']';
>                 if (++i == _buf.length)
>                     i = 0;
>                 for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>                 {
>                     char ch = _buf[ i ];
>                     if (ch == '>' && secondToLastWasBracket && lastWasBracket)
>                         i = replace( i, "&gt;" );
>                     else if (isBadChar( ch ))
>                         i = replace( i, "?" );
>                     else
>                         i++;
>                     secondToLastWasBracket = lastWasBracket;
>                     lastWasBracket = ch == ']';
>                     if (i == _buf.length)
>                         i = 0;
>                 }
>                 emit( "]]>" );
>             }
>             else
>             {
>                 for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>                 {
>                     char ch = _buf[ i ];
>                     if (ch == '<')
>                         i = replace( i, "&lt;" );
>                     else if (hasCDATA && ch == '>')
>                         i = replace(i, "&gt;");
>                     else if (ch == '&')
>                         i = replace( i, "&amp;" );
>                     else if (isBadChar( ch ))
>                         i = replace( i, "?" );
>                     else
>                         i++;
>                     if (i == _buf.length)
>                         i = 0;
>                 }
>             }
>         }
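
For comparison with the patch quoted above, here is a minimal sketch of a
different, commonly used way to carry text that already contains "]]>" inside
a CDATA section: end the section between "]]" and ">" and immediately reopen
it, so no section ever contains its own terminator. This is not the proposed
patch and not XMLBeans code. The class name CdataSplitSketch and the helpers
wrapInCdata and rootText are hypothetical names used only for illustration,
and the check uses the JDK's built-in DOM parser rather than Xalan.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustration only: none of these names exist in XMLBeans.
class CdataSplitSketch {

    // Wraps text in a CDATA section. Every "]]>" in the text is split across
    // two adjacent sections, so the terminator never appears inside one.
    static String wrapInCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Parses a document with the JDK DOM parser and returns the root
    // element's text, which concatenates adjacent CDATA sections.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Document A is embedded as text in B, and B as text in C.
        String docA = "<a>1 &lt; 2</a>";
        String docB = "<b>" + wrapInCdata(docA) + "</b>";
        String docC = "<c>" + wrapInCdata(docB) + "</c>";

        // Unwrapping one level at a time should give back each inner document.
        String innerB = rootText(docC);
        String innerA = rootText(innerB);
        System.out.println(innerB.equals(docB) && innerA.equals(docA));
    }
}
```

With this approach the receiving parser sees only well-formed, non-nested
CDATA sections at every level, at the cost of producing two adjacent sections
wherever the embedded text contained "]]>".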


