You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Evgeny Kotkov <ev...@visualsvn.com> on 2016/11/23 12:31:30 UTC

Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Branko Čibej <br...@apache.org> writes:

> New issues:
>
>   - fs-test 65 fails with SQLite 3.8.10.2:
>     [[[
>     svn_tests: E200006: Expected error SVN_ERR_SQLITE_BUSY but got
>       SVN_ERR_SQLITE_ERROR
>     svn_tests: E200030: sqlite[S10]: disk I/O error
>     svn_tests: E200042: Additional errors:
>     svn_tests: E200030: sqlite[S10]: disk I/O error
>     svn_tests: E200044: SQLite transaction rollback failed
>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>     FAIL:  fs-test 65: test commit with locked rep-cache
>     ]]]

Hi Branko,

Is this failure reproducible?  Does it happen if you run just fs-test#65?

I tried to witness the failure on my Windows and Linux machines with
SQLite 3.8.10.2, but the test passes for me (the relevant sqlite-test#2
also works fine).


Regards,
Evgeny Kotkov

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Branko Čibej <br...@apache.org>.
On 23.11.2016 18:52, Daniel Shahaf wrote:
> Branko \u010cibej wrote on Wed, Nov 23, 2016 at 18:37:24 +0100:
>> On 23.11.2016 13:31, Evgeny Kotkov wrote:
>>> Branko \u010cibej <br...@apache.org> writes:
>>>
>>>> New issues:
>>>>
>>>>   - fs-test 65 fails with SQLite 3.8.10.2:
>>>>     [[[
>>>>     svn_tests: E200006: Expected error SVN_ERR_SQLITE_BUSY but got
>>>>       SVN_ERR_SQLITE_ERROR
>>>>     svn_tests: E200030: sqlite[S10]: disk I/O error
>>>>     svn_tests: E200042: Additional errors:
>>>>     svn_tests: E200030: sqlite[S10]: disk I/O error
>>>>     svn_tests: E200044: SQLite transaction rollback failed
>>>>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>>>>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>>>>     FAIL:  fs-test 65: test commit with locked rep-cache
>>>>     ]]]
>>> Hi Branko,
>>>
>>> Is this failure reproducible?  Does it happen if you run just fs-test#65?
>>>
>>> I tried to witness the failure on my Windows and Linux machines with
>>> SQLite 3.8.10.2, but the test passes for me (the relevant sqlite-test#2
>>> also works fine).
>> It is reproducible whether run as a single test, all of fs-test or the
>> whole test suite; but only with SQLite 3.8.10.2. I suspect it is a test
>> bug, but haven't followed up; could be due to a bug in SQLite itself,
>> e.g., that the older version returns the wrong error code.
>>
>> The only potential problem as far as I can see is that 3.8.10.2 is the
>> version of SQLite shipped with OSX \u2014 hence, the version that most
>> binaries will be linking with.
> Does the test pass if you patch it to expect SVN_ERR_SQLITE_ERROR?  Once
> the shared lock is released, do subsequent commits operate normally
> (finish timely and update rep-cache.db)?
>
> I.e., does this failure mode have any impact beyond the wrong
> apr_status_t value being returned by svn_fs_commit_txn()?

Yes, it passes if I change that one line in the test.

I just double-checked our rep-cache and SQLite wrapper code and I don't
see any way that SQLite could've returned SQLITE_BUSY and we'd return
SVN_ERR_SQLITE_ERROR instead.

-- Brane

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Branko \u010cibej wrote on Wed, Nov 23, 2016 at 18:37:24 +0100:
> On 23.11.2016 13:31, Evgeny Kotkov wrote:
> > Branko \u010cibej <br...@apache.org> writes:
> >
> >> New issues:
> >>
> >>   - fs-test 65 fails with SQLite 3.8.10.2:
> >>     [[[
> >>     svn_tests: E200006: Expected error SVN_ERR_SQLITE_BUSY but got
> >>       SVN_ERR_SQLITE_ERROR
> >>     svn_tests: E200030: sqlite[S10]: disk I/O error
> >>     svn_tests: E200042: Additional errors:
> >>     svn_tests: E200030: sqlite[S10]: disk I/O error
> >>     svn_tests: E200044: SQLite transaction rollback failed
> >>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
> >>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
> >>     FAIL:  fs-test 65: test commit with locked rep-cache
> >>     ]]]
> > Hi Branko,
> >
> > Is this failure reproducible?  Does it happen if you run just fs-test#65?
> >
> > I tried to witness the failure on my Windows and Linux machines with
> > SQLite 3.8.10.2, but the test passes for me (the relevant sqlite-test#2
> > also works fine).
> 
> It is reproducible whether run as a single test, all of fs-test or the
> whole test suite; but only with SQLite 3.8.10.2. I suspect it is a test
> bug, but haven't followed up; could be due to a bug in SQLite itself,
> e.g., that the older version returns the wrong error code.
> 
> The only potential problem as far as I can see is that 3.8.10.2 is the
> version of SQLite shipped with OSX \u2014 hence, the version that most
> binaries will be linking with.

Does the test pass if you patch it to expect SVN_ERR_SQLITE_ERROR?  Once
the shared lock is released, do subsequent commits operate normally
(finish timely and update rep-cache.db)?

I.e., does this failure mode have any impact beyond the wrong
apr_status_t value being returned by svn_fs_commit_txn()?

Cheers,

Daniel

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Branko Čibej <br...@apache.org>.
On 24.11.2016 12:06, Evgeny Kotkov wrote:
> Evgeny Kotkov <ev...@visualsvn.com> writes:
>
>>> The question is: what do we do about it? Complaining to Apple isn't
>>> likely to help. We could add a special case in the testcase for that
>>> version of SQLite on OSX, just to keep the test output in the green. But
>>> ... that seems like just a bit overdone.
>> To my mind, the test is of less concern, compared to the issue itself.
>>
>> I would prefer to keep the test catching this bug for now, unless that
>> causes severe issues for maintainers.  If that happens, we could add a
>> workaround.
> Actually, let me rephrase myself.
>
> I think that we _should not_ change the test, because it does what it's
> supposed to do (catch bugs).  In this case, maintainers would be better
> using official sqlite-amalgamation, until the bug in the customized SQLite
> library shipped with OS X is fixed.  The failing test is a proper indication
> of this bug.

Ack. It's going to be painful if I ever update the OSX build slave, but
you're right.

In the meantime, I've verified that, Homebrew's build does not use the
stock SQLite on the Mac but Homebrew's own version, which does not have
this problem. I don't know about Fink and other installers.

-- Brane

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Evgeny Kotkov <ev...@visualsvn.com> writes:

>> The question is: what do we do about it? Complaining to Apple isn't
>> likely to help. We could add a special case in the testcase for that
>> version of SQLite on OSX, just to keep the test output in the green. But
>> ... that seems like just a bit overdone.
>
> To my mind, the test is of less concern, compared to the issue itself.
>
> I would prefer to keep the test catching this bug for now, unless that
> causes severe issues for maintainers.  If that happens, we could add a
> workaround.

Actually, let me rephrase myself.

I think that we _should not_ change the test, because it does what it's
supposed to do (catch bugs).  In this case, maintainers would be better
using official sqlite-amalgamation, until the bug in the customized SQLite
library shipped with OS X is fixed.  The failing test is a proper indication
of this bug.


Regards,
Evgeny Kotkov

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Julian Foad <ju...@apache.org> writes:

> Just a drive-by thought: Should we report this to SQLite? Especially if by
> distilling the Subversion test we could write a SQLite self-test. I recall
> the SQLite team is big on thorough regression testing and so would likely
> want to know about this.
>
> (They then might have some influence with the Apple folks, or maybe not, but
> that's for them to deal with. It won't change our situation in the short
> time with regard to this particular version.)

I will take care of reporting this to SQLite authors.

This is quite an issue, as this bug caused by a custom patch makes SQLite
(a _derivative_ distributed under the name of SQLite, to be precise) unusable
in a typical reader-writer case.  In the described case, the vanilla version
would have blocked in a retry loop allowing for the reader to finish its work
and notifying the API user via xBusyHandler(), but the patch causes it to
fail immediately, and this prevents concurrency.

Branko Čibej <br...@apache.org> writes:

> The question is: what do we do about it? Complaining to Apple isn't
> likely to help. We could add a special case in the testcase for that
> version of SQLite on OSX, just to keep the test output in the green. But
> ... that seems like just a bit overdone.

To my mind, the test is of less concern, compared to the issue itself.

I would prefer to keep the test catching this bug for now, unless that
causes severe issues for maintainers.  If that happens, we could add a
workaround.


Regards,
Evgeny Kotkov

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Julian Foad <ju...@apache.org>.
Branko \u010cibej wrote:
> To be quite candid, I'm not surprised ... this wouldn't be the first
> time that Apple messed up its patches of perfectly good upstream
> software. :(
>
> The question is: what do we do about it? Complaining to Apple isn't
> likely to help. We could add a special case in the testcase for that
> version of SQLite on OSX, just to keep the test output in the green. But
> ... that seems like just a bit overdone.

Just a drive-by thought: Should we report this to SQLite? Especially if 
by distilling the Subversion test we could write a SQLite self-test. I 
recall the SQLite team is big on thorough regression testing and so 
would likely want to know about this.

(They then might have some influence with the Apple folks, or maybe not, 
but that's for them to deal with. It won't change our situation in the 
short time with regard to this particular version.)

- Julian

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Branko Čibej <br...@apache.org>.
On 24.11.2016 00:05, Evgeny Kotkov wrote:
> Branko \u010cibej <br...@apache.org> writes:
>
>> It is reproducible whether run as a single test, all of fs-test or the
>> whole test suite; but only with SQLite 3.8.10.2. I suspect it is a test
>> bug, but haven't followed up; could be due to a bug in SQLite itself,
>> e.g., that the older version returns the wrong error code.
>>
>> The only potential problem as far as I can see is that 3.8.10.2 is the
>> version of SQLite shipped with OSX \u2014 hence, the version that most
>> binaries will be linking with.
> Well, it gets interesting.
>
> I can reproduce the failure under OS X 10.11 with bundled SQLite 3.8.10.2.
> However, this failure is specific to SQLite that is shipped with OS X.
> Official versions (including 3.8.10.2 from [1]) return SQLITE_BUSY,
> as promised in the documentation, and the test passes.
>
> `dtrace` shows that the version shipped with OS X is heavily patched.
> There's a different syscall pattern: guarded_open_np/guarded_pwrite_np/
> guarded_close_np instead of open/write/close, pread instead of lseek+read,
> different fcntl calls, etc.  The sqlite3.h headers are also a bit different.
>
> I would guess that SQLite version shipped with OS X has a custom VFS
> implementation, which doesn't keep some of the promises of the official
> version, or contains a bug that results in an I/O error instead of expected
> SQLITE_BUSY error.
>
> Could you please confirm this by running the test with the official 3.8.10.2
> from [1]?
>
> [1] https://sqlite.org/2015/sqlite-amalgamation-3081002.zip

Indeed, fs-test passes with the stock SQLite. Running all the tests now
just to make sure there's no quirk elsewhere.

To be quite candid, I'm not surprised ... this wouldn't be the first
time that Apple messed up its patches of perfectly good upstream
software. :(


The question is: what do we do about it? Complaining to Apple isn't
likely to help. We could add a special case in the testcase for that
version of SQLite on OSX, just to keep the test output in the green. But
... that seems like just a bit overdone.

-- Brane

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@apache.org> writes:

> It is reproducible whether run as a single test, all of fs-test or the
> whole test suite; but only with SQLite 3.8.10.2. I suspect it is a test
> bug, but haven't followed up; could be due to a bug in SQLite itself,
> e.g., that the older version returns the wrong error code.
>
> The only potential problem as far as I can see is that 3.8.10.2 is the
> version of SQLite shipped with OSX — hence, the version that most
> binaries will be linking with.

Well, it gets interesting.

I can reproduce the failure under OS X 10.11 with bundled SQLite 3.8.10.2.
However, this failure is specific to SQLite that is shipped with OS X.
Official versions (including 3.8.10.2 from [1]) return SQLITE_BUSY,
as promised in the documentation, and the test passes.

`dtrace` shows that the version shipped with OS X is heavily patched.
There's a different syscall pattern: guarded_open_np/guarded_pwrite_np/
guarded_close_np instead of open/write/close, pread instead of lseek+read,
different fcntl calls, etc.  The sqlite3.h headers are also a bit different.

I would guess that SQLite version shipped with OS X has a custom VFS
implementation, which doesn't keep some of the promises of the official
version, or contains a bug that results in an I/O error instead of expected
SQLITE_BUSY error.

Could you please confirm this by running the test with the official 3.8.10.2
from [1]?

[1] https://sqlite.org/2015/sqlite-amalgamation-3081002.zip


Thanks,
Evgeny Kotkov

Re: Failed fs-test 65 (Was: Re: 1.9.5 up for signing/testing)

Posted by Branko Čibej <br...@apache.org>.
On 23.11.2016 13:31, Evgeny Kotkov wrote:
> Branko \u010cibej <br...@apache.org> writes:
>
>> New issues:
>>
>>   - fs-test 65 fails with SQLite 3.8.10.2:
>>     [[[
>>     svn_tests: E200006: Expected error SVN_ERR_SQLITE_BUSY but got
>>       SVN_ERR_SQLITE_ERROR
>>     svn_tests: E200030: sqlite[S10]: disk I/O error
>>     svn_tests: E200042: Additional errors:
>>     svn_tests: E200030: sqlite[S10]: disk I/O error
>>     svn_tests: E200044: SQLite transaction rollback failed
>>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>>     svn_tests: E200030: sqlite[S1]: cannot rollback - no transaction is active
>>     FAIL:  fs-test 65: test commit with locked rep-cache
>>     ]]]
> Hi Branko,
>
> Is this failure reproducible?  Does it happen if you run just fs-test#65?
>
> I tried to witness the failure on my Windows and Linux machines with
> SQLite 3.8.10.2, but the test passes for me (the relevant sqlite-test#2
> also works fine).

It is reproducible whether run as a single test, all of fs-test or the
whole test suite; but only with SQLite 3.8.10.2. I suspect it is a test
bug, but haven't followed up; could be due to a bug in SQLite itself,
e.g., that the older version returns the wrong error code.

The only potential problem as far as I can see is that 3.8.10.2 is the
version of SQLite shipped with OSX \u2014 hence, the version that most
binaries will be linking with.

-- Brane