Posted to dev@subversion.apache.org by "Andreas J. Koenig" <an...@anima.de> on 2002/12/04 10:13:51 UTC

Bug: Control char in commit message

This one took me days to track down. Although some might argue that
it's a misuse, the trap is so easy to fall into that I'd call it a
serious bug.

Note that the 'foo ^H bar' message below contained a literal
Control-H (0x08):

% echo foo > foo
% svn add foo
A         foo
% svn ci -m 'foo ^H bar'
subversion/libsvn_client/commit.c:655: (apr_err=175002, src_err=0)
svn: RA layer request failed
svn: Commit failed (details follow):
subversion/libsvn_ra_dav/util.c:81: (apr_err=175002, src_err=0)
svn: applying log message to /svn/test/!svn/wbl/84abf719-f6b0-0310-847f-9d47c53dd84d/0: 400 Bad Request


People who take the commit message from a file instead of typing it
in can hit this all too easily. So it must either be documented as a
restriction or fixed.

Let me know if I should file an issue about it.

Thanks,
-- 
andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Bug: Control char in commit message

Posted by "Andreas J. Koenig" <an...@anima.de>.
>>>>> On 04 Dec 2002 15:46:04 -0600, Karl Fogel <kf...@newton.ch.collab.net> said:

  > andreas.koenig@anima.de (Andreas J. Koenig) writes:
 >> Sorry, my C skills aren't up to that.

  > Okay, no problem.  In that case, can you file an issue, distilled from
  > the various mails in this thread?

Filed.

    http://subversion.tigris.org/issues/show_bug.cgi?id=1025

-- 
andreas


Re: Bug: Control char in commit message

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
andreas.koenig@anima.de (Andreas J. Koenig) writes:
> Sorry, my C skills aren't up to that.

Okay, no problem.  In that case, can you file an issue, distilled from
the various mails in this thread?

Thanks,
-K



Re: Bug: Control char in commit message

Posted by "Andreas J. Koenig" <an...@anima.de>.
>>>>> On 04 Dec 2002 11:31:34 -0600, Karl Fogel <kf...@newton.ch.collab.net> said:

  > andreas.koenig@anima.de (Andreas J. Koenig) writes:
 >> I cannot run my server under gdb unless you give me detailed
 >> instructions on how to do that.

  > Sure -- it's in the HACKING file, see the section "Debugging the server".

  > (I saw the rest of your mail, not sure yet what the symptoms signify,
  > so I'm hoping you have time to run the server under gdb and catch it
  > in the act...)

Sorry, my C skills aren't up to that.


-- 
andreas


Re: Bug: Control char in commit message

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
andreas.koenig@anima.de (Andreas J. Koenig) writes:
> I cannot run my server under gdb unless you give me detailed
> instructions on how to do that.

Sure -- it's in the HACKING file, see the section "Debugging the server".

(I saw the rest of your mail, not sure yet what the symptoms signify,
so I'm hoping you have time to run the server under gdb and catch it
in the act...)

-K


Re: Bug: Control char in commit message

Posted by "Andreas J. Koenig" <an...@anima.de>.
>>>>> On 04 Dec 2002 08:46:03 -0600, Karl Fogel <kf...@newton.ch.collab.net> said:

  > Thanks for the investigation!
  > The log message is being recoded from locale to UTF-8 for commit.  So
  > that's one step that can fail.  Assuming it does not fail (that is,
  > the conversion returns success, even though the source may be a bunch
  > of bogus characters that just happen to all have encodings within the
  > locale), then the UTF-8 should be able to be applied over any ra
  > layer.  If it can't be sent, that's a bug.

  > I'm not sure exactly where or why your commit failed.  It was a recent
  > version of svn, right?

Yes, rev. 3953.

  > Can you trace the error to the line in source
  > where it happens?  We probably should file an issue for this, but
  > let's drill down a bit first.

Maybe this error message from the apache error_log helps?

    [Wed Dec 04 13:07:53 2002] [error] [client 127.0.0.1] XML parser error code: not well-formed (invalid token) (4)
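
That parser error can be reproduced outside Apache. Below is a minimal
sketch in Python, assuming only that the server-side XML parser is
expat (which the "XML parser error code" wording suggests). The <msg>
element and the helper name parse_error_for are made up for
illustration; they are not what the DAV request actually sends.

```python
import xml.parsers.expat


def parse_error_for(payload: bytes):
    """Return (code, message) for the expat error raised by payload,
    or None if the payload parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # True marks this chunk as the final one, so the parser
        # reports errors for the whole document.
        parser.Parse(payload, True)
    except xml.parsers.expat.ExpatError as exc:
        return exc.code, xml.parsers.expat.ErrorString(exc.code)
    return None


if __name__ == "__main__":
    # Hypothetical payload with a literal Control-H (0x08) in the text.
    print(parse_error_for(b"<msg>foo \x08 bar</msg>"))
    # The same payload without the control character.
    print(parse_error_for(b"<msg>foo bar</msg>"))
```

If the first call reports an "invalid token" error and the second
returns None, the control character alone is enough to make the
request body unparseable.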

I cannot run my server under gdb unless you give me detailed
instructions on how to do that. Here's a short test case:

    % svn co http://localhost/svn/test test-wc
    k's password: 
    
    Checked out revision 0.
    % cd test-wc 
    % echo foo > foo
    % svn add foo
    A         foo
    % svn ci -m '^H'
    subversion/libsvn_client/commit.c:655: (apr_err=175002, src_err=0)
    svn: RA layer request failed
    svn: Commit failed (details follow):
    subversion/libsvn_ra_dav/util.c:81: (apr_err=175002, src_err=0)
    svn: applying log message to /svn/test/!svn/wbl/6dc2176b-fbb0-0310-9734-e6136f8a9d49/0: 400 Bad Request
    

I find this error message in ./srclib/apr-util/xml/apr_xml.c in the
APR-UTIL sources:

    case APR_XML_ERROR_EXPAT:
        (void) apr_snprintf(errbuf, errbufsize,
                            "XML parser error code: %s (%d)",
                            XML_ErrorString(parser->xp_err), parser->xp_err);
        return errbuf;


This is what I find in the XML standard
(http://www.w3.org/TR/REC-xml#charsets):

    Character Range
    [2]    Char    ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
    
Does this ring a bell?
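
The Char production above is easy to check mechanically. Here is a
small sketch in Python that transcribes the ranges literally; the
function names is_xml_char and invalid_xml_chars are hypothetical
helpers, not part of svn or APR.

```python
def is_xml_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 Char production quoted above."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def invalid_xml_chars(text: str):
    """Return (offset, code point) pairs for characters outside Char."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if not is_xml_char(ch)]


if __name__ == "__main__":
    # The log message from the test case: a literal Control-H (0x08).
    print(invalid_xml_chars("foo \x08 bar"))
```

By this production 0x08 is not a legal character in an XML document
at all, while tab, LF and CR are.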


-- 
andreas


Re: Bug: Control char in commit message

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Thanks for the investigation!

The log message is being recoded from locale to UTF-8 for commit.  So
that's one step that can fail.  Assuming it does not fail (that is,
the conversion returns success, even though the source may be a bunch
of bogus characters that just happen to all have encodings within the
locale), then the UTF-8 should be able to be applied over any ra
layer.  If it can't be sent, that's a bug.
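
To illustrate the recoding step: a rough sketch in Python, assuming an
ISO-8859-1 locale (the thread does not say which locale was in use)
and a hypothetical helper recode_to_utf8. This is not svn's own
conversion code; it only shows that a control character passes
through such a conversion without any error.

```python
def recode_to_utf8(raw: bytes, locale_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes from the (assumed) locale encoding and re-encode as UTF-8."""
    return raw.decode(locale_encoding).encode("utf-8")


if __name__ == "__main__":
    # Control-H (0x08) has an encoding in the locale and in UTF-8,
    # so the conversion succeeds and the byte comes out unchanged.
    print(recode_to_utf8(b"foo \x08 bar"))
```

So a successful conversion says nothing about whether the result is
acceptable to whatever carries it afterwards.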

I'm not sure exactly where or why your commit failed.  It was a recent
version of svn, right?  Can you trace the error to the line in source
where it happens?  We probably should file an issue for this, but
let's drill down a bit first.

Thanks,
-Karl

andreas.koenig@anima.de (Andreas J. Koenig) writes:
> Following up to my own bug report (^H not allowed in commit message):
> 
> To provide further evidence, I checked commit messages with all
> 256 characters and found that
> 
> 1. 0x0 is allowed but is interpreted as end of string, so all
>    characters after it are cut off
> 
> 2. decimal character positions 1-8, 11-12, 14-31 are disallowed
> 
> 3. Control-M (0xd) may cause trouble if svn decides that you have
>    "inconsistent line-endings in source stream, repair flag is off."
> 
> I have no good suggestion for how this interface *should* work. For
> now I'll do my escaping according to the findings above with Perl
> 
>     s/([\000-\010\013-\037])/"^".pack("c",ord($1)^64)/eg;
> 
> and recommend others to do likewise until this issue is resolved.


Re: Bug: Control char in commit message

Posted by "Andreas J. Koenig" <an...@anima.de>.
Following up to my own bug report (^H not allowed in commit message):

To provide further evidence, I checked commit messages with all
256 characters and found that

1. 0x0 is allowed but is interpreted as end of string, so all
   characters after it are cut off

2. decimal character positions 1-8, 11-12, 14-31 are disallowed

3. Control-M (0xd) may cause trouble if svn decides that you have
   "inconsistent line-endings in source stream, repair flag is off."

I have no good suggestion for how this interface *should* work. For
now I'll do my escaping according to the findings above with Perl

    s/([\000-\010\013-\037])/"^".pack("c",ord($1)^64)/eg;

and recommend others to do likewise until this issue is resolved.
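
For anyone not using Perl, here is a sketch of the same substitution
in Python. It is meant as a direct port of the one-liner above,
covering 0x00-0x08 and 0x0B-0x1F and leaving tab and newline alone;
the function name escape_control_chars is made up for this example.

```python
import re

# Same character class as the Perl one-liner: \000-\010 and \013-\037.
_CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f]")


def escape_control_chars(message: str) -> str:
    """Replace each matched control character with caret notation,
    e.g. 0x08 becomes the two characters '^' and 'H'."""
    return _CONTROL.sub(lambda m: "^" + chr(ord(m.group()) ^ 64), message)


if __name__ == "__main__":
    print(escape_control_chars("foo \x08 bar"))
```

Note that the escaping is one-way: a message that already contains a
literal "^H" cannot be told apart from an escaped Control-H afterwards.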


-- 
andreas
