Posted to dev@xmlbeans.apache.org by "Peter Rodgers (JIRA)" <xm...@xml.apache.org> on 2006/06/01 16:51:31 UTC

[jira] Created: (XMLBEANS-274) Over zelous whitespace cropping after parsing entity like &

Over zelous whitespace cropping after parsing entity like &amp;
---------------------------------------------------------------

         Key: XMLBEANS-274
         URL: http://issues.apache.org/jira/browse/XMLBEANS-274
     Project: XMLBeans
        Type: Bug

    Versions: Version 2.1    
 Environment: All
    Reporter: Peter Rodgers


When whitespace stripping is specified, the parser does not detect XML entities such as &amp; and strips the whitespace following each entity.

For example

<root>dog &amp; cat</root>

is parsed as

<root>dog &amp;cat</root>

The cause of the problem is the stripLeft() method in the org.apache.xmlbeans.impl.store.CharUtil class.

Below is a fixed version of the method. It checks for the ';' character that ends an entity reference, which indicates that the whitespace after it is significant and must be preserved.  Note that this code does not fix the branch that iterates with a for loop (the _charIter case).

public Object stripLeft ( Object src, int off, int cch )
    {
        assert isValid( src, off, cch );

        if (cch > 0)
        {
            if (src instanceof char[])
            {
                char[] chars = (char[]) src;

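                // Fix for &amp; etc.: keep whitespace that directly follows ';'.
                // The off == 0 check avoids reading chars[ -1 ].
                while ( cch > 0 && isWhiteSpace( chars[ off ] ) && (off == 0 || chars[ off - 1 ] != ';') )
                    { cch--; off++; }
            }
            else if (src instanceof String)
            {
                String s = (String) src;

                // Same fix for the String case, with the same off == 0 guard.
                while ( cch > 0 && isWhiteSpace( s.charAt( off ) ) && (off == 0 || s.charAt( off - 1 ) != ';') )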
                    { cch--; off++; }
            }
            else
            {
                int count = 0;
                
                for ( _charIter.init( src, off, cch ) ; _charIter.hasNext() ; count++ )
                    if (!isWhiteSpace( _charIter.next() ))
                        break;
                
                _charIter.release();

                off += count;
            }
        }

        if (cch == 0)
        {
            _offSrc = 0;
            _cchSrc = 0;
            
            return null;
        }

        _offSrc = off;
        _cchSrc = cch;

        return src;
    }
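
The String branch of the check above can be tried in isolation with a small stand-alone sketch. The class StripLeftSketch, the method stripLeftOffset and the simplified isWhiteSpace are hypothetical names made up for this illustration; they are not part of CharUtil. The sketch also shows a limitation of the approach: any ';' in front of the range, whether or not it ends an entity, suppresses the stripping.

```java
// Hypothetical stand-alone sketch; not the actual CharUtil class.
final class StripLeftSketch {

    // Simplified stand-in for CharUtil.isWhiteSpace (assumption: the four XML whitespace characters).
    static boolean isWhiteSpace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    // Returns the offset of the first character kept after left-stripping
    // the range [off, off + cch) of s.
    static int stripLeftOffset(String s, int off, int cch) {
        // Whitespace directly after ';' is treated as significant.
        // off == 0 is tested first so that charAt(off - 1) is never out of range.
        while (cch > 0 && isWhiteSpace(s.charAt(off))
                && (off == 0 || s.charAt(off - 1) != ';')) {
            cch--;
            off++;
        }
        return off;
    }

    public static void main(String[] args) {
        // Range " cat" starts at index 9, right after the ';' of &amp;.
        System.out.println(stripLeftOffset("dog &amp; cat", 9, 4)); // prints 9 (nothing stripped)
        // Ordinary leading whitespace is still stripped.
        System.out.println(stripLeftOffset("  cat", 0, 5));         // prints 2
        // A ';' that is not part of an entity has the same effect.
        System.out.println(stripLeftOffset("a; b", 2, 2));          // prints 2 (nothing stripped)
    }
}
```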

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


[jira] Updated: (XMLBEANS-274) Over zealous whitespace cropping after parsing entity like &

Posted by "Peter Rodgers (JIRA)" <xm...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XMLBEANS-274?page=all ]

Peter Rodgers updated XMLBEANS-274:
-----------------------------------

        Summary: Over zealous whitespace cropping after parsing entity like &amp;  (was: Over zelous whitespace cropping after parsing entity like &amp;)
    Description: 
When whitespace stripping is specified, the parser does not detect XML entities such as &amp; and strips the whitespace following each entity.

For example

<root>dog &amp; cat</root>

is parsed as

<root>dog &amp;cat</root>

The cause of the problem is the stripLeft() method in the org.apache.xmlbeans.impl.store.CharUtil class.

Below is a fixed version of the method. It checks for the ';' character that ends an entity reference, which indicates that the whitespace after it is significant and must be preserved.  Note that this code does not fix the branch that iterates with a for loop (the _charIter case).

public Object stripLeft ( Object src, int off, int cch )
    {
        assert isValid( src, off, cch );

        if (cch > 0)
        {
            if (src instanceof char[])
            {
                char[] chars = (char[]) src;

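                // Fix for &amp; etc.: keep whitespace that directly follows ';'.
                // The off == 0 check avoids reading chars[ -1 ].
                while ( cch > 0 && isWhiteSpace( chars[ off ] ) && (off == 0 || chars[ off - 1 ] != ';') )
                    { cch--; off++; }
            }
            else if (src instanceof String)
            {
                String s = (String) src;

                // Same fix for the String case, with the same off == 0 guard.
                while ( cch > 0 && isWhiteSpace( s.charAt( off ) ) && (off == 0 || s.charAt( off - 1 ) != ';') )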
                    { cch--; off++; }
            }
            else
            {
                int count = 0;
                
                for ( _charIter.init( src, off, cch ) ; _charIter.hasNext() ; count++ )
                    if (!isWhiteSpace( _charIter.next() ))
                        break;
                
                _charIter.release();

                off += count;
            }
        }

        if (cch == 0)
        {
            _offSrc = 0;
            _cchSrc = 0;
            
            return null;
        }

        _offSrc = off;
        _cchSrc = cch;

        return src;
    }





[jira] Resolved: (XMLBEANS-274) Over zealous whitespace cropping after parsing entity like &

Posted by "Cezar Andrei (JIRA)" <xm...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XMLBEANS-274?page=all ]

Cezar Andrei resolved XMLBEANS-274.
-----------------------------------

    Fix Version/s: TBD
       Resolution: Fixed

The fix is a little more complicated: the CurLoadContext had to be modified to avoid leftStrip of consecutive text events.
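
To illustrate why consecutive text events matter, here is a minimal sketch. It assumes the parser reports the content of <root>dog &amp; cat</root> as three separate text events ("dog ", "&" and " cat"), which is a common way for parsers to split text around an entity reference. The class and method names are hypothetical, and this is not the actual CurLoadContext change, only the idea behind it.

```java
// Hypothetical sketch; names are illustrative and this is not the XMLBeans source.
final class TextEventStripSketch {

    // Removes leading whitespace from a single chunk of text.
    static String stripLeft(String chunk) {
        int i = 0;
        while (i < chunk.length() && Character.isWhitespace(chunk.charAt(i))) {
            i++;
        }
        return chunk.substring(i);
    }

    // Behaviour described in the report: every text event is left-stripped on its own.
    static String stripEachEvent(String[] events) {
        StringBuilder out = new StringBuilder();
        for (String event : events) {
            out.append(stripLeft(event));
        }
        return out.toString();
    }

    // Idea of the fix: treat a run of consecutive text events as one text,
    // so only whitespace before the first real content is stripped.
    static String stripRunAsWhole(String[] events) {
        StringBuilder out = new StringBuilder();
        for (String event : events) {
            out.append(out.length() == 0 ? stripLeft(event) : event);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] events = { "dog ", "&", " cat" }; // assumed event split
        System.out.println(stripEachEvent(events));  // prints "dog &cat"
        System.out.println(stripRunAsWhole(events)); // prints "dog & cat"
    }
}
```

Unlike the ';' check in stripLeft(), this approach does not depend on which character precedes the whitespace, so it also covers the iterator-based branch.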

> Over zealous whitespace cropping after parsing entity like &amp;
> ----------------------------------------------------------------
>
>                 Key: XMLBEANS-274
>                 URL: http://issues.apache.org/jira/browse/XMLBEANS-274
>             Project: XMLBeans
>          Issue Type: Bug
>    Affects Versions: Version 2.1
>         Environment: All
>            Reporter: Peter Rodgers
>             Fix For: TBD
