Posted to users@subversion.apache.org by Carl Henrik Bernhoft <ch...@bernhoft.no> on 2022/09/29 12:11:24 UTC

Unable to pull full commit history; Malformed XML not well-formed (invalid token)

Using SVN version:
svn, version 1.14.1 (r1886195)
  compiled May 21 2022, 10:52:35 on x86_64-pc-linux-gnu.

I tried pulling the commit log from
https://github.com/haskell/random/branches/master but the process snags on
https://github.com/haskell/random/commit/a44b801ab0033970660396a42462c4f7b4df56bb
which corresponds to revision 897.

The error is reproducible from the command line with
svn log --xml -r897 https://github.com/haskell/random/branches/master

The output is
svn: E175009: The XML response contains invalid XML
svn: E130003: Malformed XML: not well-formed (invalid token) at line 8


I was expecting control characters to be stripped, fuzzified or otherwise
handled. The offending line in the commit, when printed with 'git log'
displays as: 'Merged Martin SjA?gren's patch for multiline descriptionsb'
which looks like reasonable output by comparison.

I saw that handling Unicode and control characters was a topic of
discussion years ago but this case looks like a bug to me. It doesn't seem
reasonable for the process to crash when pulling a remote commit log of a
repo I don't own/control.

Re: Unable to pull full commit history; Malformed XML not well-formed (invalid token)

Posted by Daniel Sahlberg <da...@gmail.com>.
On Thu, 29 Sep 2022 at 14:15, Carl Henrik Bernhoft <ch...@bernhoft.no> wrote:

> Using SVN version:
> svn, version 1.14.1 (r1886195)
>   compiled May 21 2022, 10:52:35 on x86_64-pc-linux-gnu.
>
> I tried pulling the commit log from
> https://github.com/haskell/random/branches/master but the process snags
> on
> https://github.com/haskell/random/commit/a44b801ab0033970660396a42462c4f7b4df56bb which
> corresponds to revision 897.
>
> The error is reproducible from the command line with
> svn log --xml -r897 https://github.com/haskell/random/branches/master
>
> The output is
> svn: E175009: The XML response contains invalid XML
> svn: E130003: Malformed XML: not well-formed (invalid token) at line 8
>
>
The error reproduces without the --xml argument as well:

C:\Users\dsg>svn log -r897 https://github.com/haskell/random/branches/master
svn: E175009: The XML response contains invalid XML
svn: E130003: Malformed XML: not well-formed (invalid token) at line 8

I suppose the "XML" mentioned in the error message is there because the
HTTP/WebDAV protocol is XML-based. XML is quite picky about how certain
characters are encoded.
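
As a rough illustration of that pickiness (not taken from the thread): XML 1.0
allows only tab, line feed and carriage return among the control characters
below U+0020, and a conforming parser rejects anything else outright. The
Python sketch below uses the standard library's expat-based parser, which
reports the same "not well-formed (invalid token)" wording. The sample message
and the helper names `parses` and `strip_disallowed` are hypothetical, and the
backspace character is only a stand-in, since the actual offending byte in
r897 was not identified here.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids (everything below U+0020
# except tab, line feed and carriage return). Assumption: this sketch
# ignores the other forbidden code points such as U+FFFE and U+FFFF.
_DISALLOWED_CONTROLS = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError as exc:
        # The exception text includes the parser's reason and position.
        print("parse failed:", exc)
        return False


def strip_disallowed(text: str) -> str:
    """Drop the forbidden control characters from text."""
    return _DISALLOWED_CONTROLS.sub("", text)


# Hypothetical log message ending in a stray backspace (U+0008).
message = "patch for multiline descriptions\x08"

raw_ok = parses("<comment>" + message + "</comment>")
clean_ok = parses("<comment>" + strip_disallowed(message) + "</comment>")
```

Here `raw_ok` ends up false and `clean_ok` true, which is the kind of
stripping a server would need to do before putting such a message into an
XML response.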

> I was expecting control characters to be stripped, fuzzified or otherwise
> handled. The offending line in the commit, when printed with 'git log'
> displays as: 'Merged Martin SjA?gren's patch for multiline descriptionsb'
> which looks like reasonable output by comparison.
>
> I saw that handling Unicode and control characters was a topic of
> discussion years ago but this case looks like a bug to me. It doesn't seem
> reasonable for the process to crash when pulling a remote commit log of a
> repo I don't own/control.
>

I'm guessing - but haven't verified yet, since I couldn't figure out a way to
quickly sniff the network traffic - that GitHub's servers are not encoding
the character properly when they send it to the client. My google-fu wasn't
enough to find out whether the error has been discussed on the GitHub side.
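
If someone does manage to capture the response body (for example through a
debugging proxy), a small script along these lines could point at the
offending byte. This is only a sketch: `find_suspect_bytes` is a hypothetical
helper, the sample body is made up, and it assumes the response is meant to
be UTF-8 and that lines are counted by newline, which may not match exactly
how the client arrives at "line 8".

```python
from typing import Iterator, Tuple

# Control bytes that XML 1.0 does allow: tab, line feed, carriage return.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}


def find_suspect_bytes(body: bytes) -> Iterator[Tuple[int, int, str]]:
    """Yield (line, column, description) for bytes that look invalid.

    Reports forbidden control bytes and the first invalid UTF-8
    sequence on each line. Lines and columns are 1-based.
    """
    for lineno, line in enumerate(body.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte < 0x20 and byte not in ALLOWED_CONTROLS:
                yield lineno, col, f"control byte 0x{byte:02x}"
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            bad = line[exc.start]
            yield lineno, exc.start + 1, f"invalid UTF-8 byte 0x{bad:02x}"


# Made-up body: a Latin-1 encoded character and a stray backspace on line 2.
sample = b"<a>\n<b>Sj\xf6gren\x08</b>\n</a>"
findings = list(find_suspect_bytes(sample))
for lineno, col, what in findings:
    print(f"line {lineno}, column {col}: {what}")
```

Running it against a real capture would show whether the server sends a raw
control byte, a non-UTF-8 byte, or something else entirely.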

Kind regards,
Daniel

Re: Unable to pull full commit history; Malformed XML not well-formed (invalid token)

Posted by Jon Daley via users <us...@subversion.apache.org>.
Interesting. I can confirm the behavior (I have the same version as you).
Unfortunately, I can't help you. Google seems to say the repo is
corrupted and needs to be fixed.

-- 
Jon Daley
https://jon.limedaley.com
~~
Live your life around the word of God and especially the Gospel.
-- Greg Gill