Posted to issues@cxf.apache.org by "Freeman Fang (JIRA)" <ji...@apache.org> on 2018/10/15 06:53:00 UTC

[jira] [Comment Edited] (CXF-7873) CRLF replaced by LF

    [ https://issues.apache.org/jira/browse/CXF-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16649777#comment-16649777 ] 

Freeman Fang edited comment on CXF-7873 at 10/15/18 6:52 AM:
-------------------------------------------------------------

Hi [~bengri6],

This comes from the default behavior of com.sun.xml.bind.marshaller.MinimumEscapeHandler, which has been in use since CXF 3.2.2:

{code}
public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
        // avoid calling the Writer#write method too much by assuming
        // that the escaping occurs rarely.
        // profiling revealed that this is faster than the naive code.
        int limit = start+length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            if(c == '&' || c == '<' || c == '>' || c == '\r' || (c == '\"' && isAttVal) ) {
                if(i!=start)
                    out.write(ch,start,i-start);
                start = i+1;
                switch (ch[i]) {
                    case '&':
                        out.write("&amp;");
                        break;
                    case '<':
                        out.write("&lt;");
                        break;
                    case '>':
                        out.write("&gt;");
                        break;
                    case '\"':
                        out.write("&quot;");
                        break;
                }
            }
        }
        
        if( start!=limit )
            out.write(ch,start,limit-start);
    }
{code}

So the "\r" character is matched by the condition but has no case in the switch, which means it is skipped rather than written out. This is consistent with the XML spec [1]:
{code}

2.11 End-of-Line Handling

XML parsed entities are often stored in computer files which, for editing convenience, are organized into lines. These lines are typically separated by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED (#xA).

To simplify the tasks of applications, the XML processor MUST behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.
{code}
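
To make the effect concrete, here is a minimal sketch that copies the escape loop quoted above into a standalone class, so it runs without JAXB on the classpath. The class name MinimumEscapeDemo and the helper escapeText are made up for this illustration; only the loop logic comes from the snippet above.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Standalone copy of the escape loop quoted above (illustration only, not the library class).
class MinimumEscapeDemo {

    static void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int limit = start + length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            if (c == '&' || c == '<' || c == '>' || c == '\r' || (c == '"' && isAttVal)) {
                // Flush the unescaped run that precedes the special character.
                if (i != start) {
                    out.write(ch, start, i - start);
                }
                start = i + 1;
                switch (c) {
                    case '&': out.write("&amp;"); break;
                    case '<': out.write("&lt;"); break;
                    case '>': out.write("&gt;"); break;
                    case '"': out.write("&quot;"); break;
                    // No case for '\r': it is skipped and nothing is written.
                }
            }
        }
        // Flush whatever is left after the last special character.
        if (start != limit) {
            out.write(ch, start, limit - start);
        }
    }

    // Hypothetical helper for the demo: escape a whole string as element text.
    static String escapeText(String s) {
        StringWriter out = new StringWriter();
        try {
            escape(s.toCharArray(), 0, s.length(), false, out);
        } catch (IOException e) {
            throw new IllegalStateException(e); // StringWriter does not throw
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escapeText("line1\r\nline2 <b>");
        // Make control characters visible when printing.
        System.out.println(escaped.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

With this loop, a CRLF in the input comes out as a bare LF, which matches what the reporter observes.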


You can override the default behavior by setting a custom EscapeHandler on the JAXB marshaller through the CXF marshallerProperties. See the "JAXB Properties" section of [2].
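
As a rough sketch of what such a handler could look like: the code below writes CR as the numeric character reference &#13;, which a parser does not fold into LF during line-break normalization. To keep it runnable without JAXB, it uses a local interface (EscapeHandlerSketch, a made-up name) with the same escape(...) signature as the method quoted above; in a real project you would implement com.sun.xml.bind.marshaller.CharacterEscapeHandler instead. The class name CrPreservingEscapeHandler and the helper escapeText are also hypothetical.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Local stand-in with the same method signature as the escape(...) method quoted above.
// Assumption: the real interface to implement is com.sun.xml.bind.marshaller.CharacterEscapeHandler.
interface EscapeHandlerSketch {
    void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException;
}

// Hypothetical handler that keeps CR by writing it as a numeric character reference.
class CrPreservingEscapeHandler implements EscapeHandlerSketch {

    @Override
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int limit = start + length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            switch (c) {
                case '&': out.write("&amp;"); break;
                case '<': out.write("&lt;"); break;
                case '>': out.write("&gt;"); break;
                case '\r': out.write("&#13;"); break; // preserved instead of dropped
                case '"':
                    // Quotes only need escaping inside attribute values.
                    if (isAttVal) {
                        out.write("&quot;");
                    } else {
                        out.write(c);
                    }
                    break;
                default:
                    out.write(c);
            }
        }
    }

    // Hypothetical helper for the demo: escape a whole string as element text.
    static String escapeText(String s) {
        StringWriter out = new StringWriter();
        try {
            new CrPreservingEscapeHandler().escape(s.toCharArray(), 0, s.length(), false, out);
        } catch (IOException e) {
            throw new IllegalStateException(e); // StringWriter does not throw
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escapeText("line1\r\nline2");
        // Make the remaining line feed visible when printing.
        System.out.println(escaped.replace("\n", "\\n"));
    }
}
```

The handler instance would then go into the marshallerProperties map under the JAXB escape handler property key. For the JAXB reference implementation this is commonly com.sun.xml.bind.characterEscapeHandler, but treat that key as an assumption and check it against your JAXB version and the page in [2].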

[1] https://www.w3.org/TR/xml/
[2] http://cxf.apache.org/docs/jaxb.html


Freeman


> CRLF replaced by LF
> -------------------
>
>                 Key: CXF-7873
>                 URL: https://issues.apache.org/jira/browse/CXF-7873
>             Project: CXF
>          Issue Type: Bug
>          Components: JAXB Databinding
>    Affects Versions: 3.2.2, 3.2.3, 3.2.4, 3.2.5, 3.2.6
>         Environment: Windows 10 / Windows Server 2016
> Eclipse Photon
> jdk1.8.0_181
> Tomcat or jetty or pojo
>            Reporter: Benoît Grimée
>            Assignee: Freeman Fang
>            Priority: Major
>         Attachments: java_first_pojo_crlf_issue.zip
>
>
> If I send a string that contains a CRLF to a web service operation, the server removes the CR character.
> If I replace the cxf-rt-databinding-jaxb artifact with any previous version (3.2.1 or lower), no CR is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)