Posted to c-dev@axis.apache.org by "Bill Mitchell (JIRA)" <ji...@apache.org> on 2008/02/02 16:50:08 UTC

[jira] Updated: (AXIS2C-859) guththila parser fails to handle escape sequences for ampersand, less than, greater than

     [ https://issues.apache.org/jira/browse/AXIS2C-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Mitchell updated AXIS2C-859:
---------------------------------

    Attachment: guththila_xml_writer.diff
                diff_2.txt

Lahiru, looking at the code again, I now agree that you were right to replace the character by sliding the token data down.  I was under the mistaken impression that the code was sliding all the rest of the buffer down; as long as we are sliding from one end or the other of the token, there is no reason not to do the obvious slide down.
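
For readers without the attachments, here is a minimal sketch of the slide-down idea.  It is not the code in diff_2.txt and does not use guththila's token structures; the function name unescape_in_place and its (buffer, length) interface are assumptions made for illustration.  It handles only the three sequences named in this issue, and because a replacement is always shorter than the sequence it replaces, the write position never passes the read position.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of guththila: rewrites buf[0..len) in place,
 * turning "&amp;", "&lt;" and "&gt;" into '&', '<' and '>'.  The remaining
 * token bytes slide down over the freed space.  Returns the new length. */
static size_t unescape_in_place(char *buf, size_t len)
{
    static const struct { const char *seq; size_t n; char ch; } ents[] = {
        { "&amp;", 5, '&' },
        { "&lt;",  4, '<' },
        { "&gt;",  4, '>' }
    };
    size_t r = 0;   /* read position */
    size_t w = 0;   /* write position, always <= r */

    while (r < len) {
        int matched = 0;
        if (buf[r] == '&') {
            size_t i;
            for (i = 0; i < sizeof ents / sizeof ents[0]; i++) {
                if (len - r >= ents[i].n &&
                    memcmp(buf + r, ents[i].seq, ents[i].n) == 0) {
                    buf[w++] = ents[i].ch;   /* one character replaces the sequence */
                    r += ents[i].n;
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            buf[w++] = buf[r++];             /* ordinary byte: slide it down */
    }
    return w;
}
```

Only the bytes of the one token are moved, never the rest of the parser buffer.  A caller would have to skip this step for CDATA sections, where no conversion is allowed.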

In the attached diff_2.txt, I moved the code to perform the replacement into a lower level routine.  As guththila_close_token has constructed a temp token in both the text case and the attribute value case, it is easy to perform replacement on this temp token string before further processing of the attribute for a namespace declaration.  Beware that the line number where we change the token type to _text may be different in yours; my version includes changes for AXIS2C-933 that Supun wants to review before they are applied.  

Separately, the attached guththila_xml_writer.diff is a patch for the other side of this issue: the insertion of escape sequences on outgoing messages that include ampersand or greater than in the text.
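
As a rough illustration of the outgoing direction (again a sketch, not the contents of guththila_xml_writer.diff), the writer has to expand each special character into its escape sequence before the text goes on the wire.  The name escape_text and its caller-supplied output buffer are assumptions; the real writer has its own buffering.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of guththila: copies src[0..len) into dst,
 * expanding '&', '<' and '>' into "&amp;", "&lt;" and "&gt;".
 * Returns the number of bytes the escaped text needs, excluding the
 * terminating NUL.  If the return value is >= cap the output did not fit;
 * dst is still NUL-terminated but its contents should not be used. */
static size_t escape_text(const char *src, size_t len, char *dst, size_t cap)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        const char *rep;
        size_t n;

        switch (src[i]) {
        case '&': rep = "&amp;"; n = 5; break;
        case '<': rep = "&lt;";  n = 4; break;
        case '>': rep = "&gt;";  n = 4; break;
        default:  rep = &src[i]; n = 1; break;
        }
        if (out + n < cap)               /* leave room for the NUL */
            memcpy(dst + out, rep, n);
        out += n;
    }
    if (cap > 0)
        dst[out < cap ? out : cap - 1] = '\0';
    return out;
}
```

Since the function reports the required size, a caller can retry with a larger buffer when the first attempt does not fit.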

With both fixes installed, I was able to see ampersand data characters from the client arrive at the server intact, and vice versa.  

> guththila parser fails to handle escape sequences for ampersand, less than, greater than
> ----------------------------------------------------------------------------------------
>
>                 Key: AXIS2C-859
>                 URL: https://issues.apache.org/jira/browse/AXIS2C-859
>             Project: Axis2-C
>          Issue Type: Bug
>          Components: guththila
>    Affects Versions: Current (Nightly)
>         Environment: Windows XP, Visual Studio 2005, guththila parser, libcurl
>            Reporter: Bill Mitchell
>         Attachments: diff.txt, diff_1.txt, diff_2.txt, guththila_xml_writer.diff
>
>
> When an incoming message contains within text the escaped ampersand sequence, "&amp;", this sequence is being passed to the client as raw text without being converted to the single ampersand character.  Clearly, this action must take place at the level of the parser, as only the parser knows whether it is seeing simple text, and conversion is required, or text embedded in a CDATA section, where conversion is not allowed.  I have tested the build with the libxml parser, and of course the libxml parser behaves correctly: the text passed to the client contains only the single ampersand character, not the escaped sequence.  (See section 2.4 of XML 1.0 spec.)
> Looking at the code, I expect the same problem occurs with all escaped sequences, less than and greater than as well as ampersand, on both input and output.  I also don't see where CDATA sections are handled, but as I am not seeing CDATA in the messages from the service I am hitting, I have not tested this case.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: axis-c-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-c-dev-help@ws.apache.org