Posted to dev@subversion.apache.org by Stefan Sperling <st...@apache.org> on 2014/09/28 23:17:50 UTC

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

On Sun, Sep 28, 2014 at 05:56:01PM -0000, stefan2@apache.org wrote:
> Author: stefan2
> Date: Sun Sep 28 17:56:01 2014
> New Revision: 1628093
> 
> URL: http://svn.apache.org/r1628093
> Log:
> Support FSFS format 7 commits in load balanced mixed-architecture clusters.
> 
> At least theoretically, machines with different endianess or off_t sizes
> might access the same repository on e.g. an iSCSI SAN.  Then the machine
> performing a commit may have a different architecture from the one building
> up the transaction.  To allow that, even the intermediate (proto-) index
> format within those transactions must be platform-independent.

Hi Stefan,

If you describe in some more detail how to set it up I can test this
scenario for you on OpenBSD with sparc64 (64bit big-endian), amd64
(64bit little endian), macppc (32bit big-endian) and i386 (32bit little
endian) machines.

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Oct 16, 2014 at 4:25 PM, Stefan Sperling <st...@elego.de> wrote:

> On Tue, Sep 30, 2014 at 11:44:04AM +0200, Stefan Fuhrmann wrote:
> > On Tue, Sep 30, 2014 at 11:15 AM, Stefan Sperling <st...@apache.org> wrote:
> > > How do you open a transaction and postpone the commit?
> > > Using some custom code written against the FS API?
> > >
> >
> > It would require some custom code like "create greek tree,
> > create txn, modify a few nodes" on one side and "open the
> > only available txn, commit txn" on the other side.
>
> Do you have some example or starting point for that somewhere?
>

I attached a test that should do the right thing.
Execute the first two steps on separate architectures
and the 3rd one on an arch of your choice.


> > > Or can some tool such as svnmucc already do this?
> > >
> >
> > svnadmin can only list and remove txns. svnmucc
> >
>
> svnmucc what? :)
>

Hm ... don't quite remember. Probably something
along the lines of "starts, builds up and commits
a txn in a single call". Basically, hard to use for
the purpose at hand.


> > > I presume you rely on apr_off_t, not off_t, right?
> >
> >
> > Yes, I always use apr_off_t. On my system, APR typedefs
> > it as off_t.
>
> off_t is always 64bit on OpenBSD, so I could only test little/big endian
> variance. Unless perhaps if I patched APR to use a 32bit type for off_t.
>

It would certainly be interesting a) to find systems
in the wild that still use 32 bit off_t and b) for us to
check whether we still work on them.


> Would it be possible to test this in our regression test suite somehow?
>

That's probably hard unless we find a setup with
genuine 32 bit off_t because APR uses it directly
with lseek() and friends. So, we can't just redefine it.
Even the ILP32 bb-openbsd buildbot has 64 bit off_t.

-- Stefan^2.

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Sep 30, 2014 at 11:44:04AM +0200, Stefan Fuhrmann wrote:
> On Tue, Sep 30, 2014 at 11:15 AM, Stefan Sperling <st...@apache.org> wrote:
> > How do you open a transaction and postpone the commit?
> > Using some custom code written against the FS API?
> >
> 
> It would require some custom code like "create greek tree,
> create txn, modify a few nodes" on one side and "open the
> only available txn, commit txn" on the other side.

Do you have some example or starting point for that somewhere?

> > Or can some tool such as svnmucc already do this?
> >
> 
> svnadmin can only list and remove txns. svnmucc
> 

svnmucc what? :)

> > I presume you rely on apr_off_t, not off_t, right?
> 
> 
> Yes, I always use apr_off_t. On my system, APR typedefs
> it as off_t.

off_t is always 64bit on OpenBSD, so I could only test little/big endian
variance. Unless perhaps if I patched APR to use a 32bit type for off_t.

Would it be possible to test this in our regression test suite somehow?

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Sep 30, 2014 at 11:15 AM, Stefan Sperling <st...@apache.org> wrote:

> On Mon, Sep 29, 2014 at 07:18:05PM +0200, Stefan Fuhrmann wrote:
> > On Sun, Sep 28, 2014 at 11:17 PM, Stefan Sperling <st...@apache.org> wrote:
> >
> > > On Sun, Sep 28, 2014 at 05:56:01PM -0000, stefan2@apache.org wrote:
> > > > Author: stefan2
> > > > Date: Sun Sep 28 17:56:01 2014
> > > > New Revision: 1628093
> > > >
> > > > URL: http://svn.apache.org/r1628093
> > > > Log:
> > > > Support FSFS format 7 commits in load balanced mixed-architecture clusters.
> > > >
> > > > At least theoretically, machines with different endianess or off_t sizes
> > > > might access the same repository on e.g. an iSCSI SAN.  Then the machine
> > > > performing a commit may have a different architecture from the one building
> > > > up the transaction.  To allow that, even the intermediate (proto-) index
> > > > format within those transactions must be platform-independent.
> > >
> > > Hi Stefan,
> > >
> > > If you describe in some more detail how to set it up I can test this
> > > scenario for you on OpenBSD with sparc64 (64bit big-endian), amd64
> > > (64bit little endian), macppc (32bit big-endian) and i386 (32bit little
> > > endian) machines.
> > >
> >
> > Thanks for the offer. So, this is the issue that I addressed
> > with the above patch:
> >
> > At the FS API level, transactions can be opened (and implicitly
> > closed again) many times before being committed. If these
> > operations are performed from different machines accessing
> > the same repository, they might not agree upon C struct size,
> > layout or interpretation. So, for pure API consistency alone,
> > all data within a txn needs to be portable / architecture independent.
> >
> > I may be wrong but I vaguely remember that the HTTP client
> > sends the "commit" operation over a separate connection that
> > the one it used to build up the txn. If there is a load balancer
> > in front of the actual servers, that commit may get served by
> > a different machine.
> >
> > Testing the general scenario is simpler, though. Have one machine
> > build up a transaction which touches a few of nodes (such that
> > different record sizes would result in missing / misaligned data).
> > Then e.g. copy the repo to a different machine and let that one
> > do the commit. The result must then pass 'svnadmin verify'.
> >
> > The critical combinations are little / big endian and 32 bit vs.
> > 64 bit file offsets (not the same thing as 32 / 64 bit in general).
> > So, 64 bit big endian vs. 32 bit little endian with 32 bit off_t
> > would probably cover it. Tests with both machines in both
> > roles (txn creation and txn commit).
> >
> > -- Stefan^2.
>
> How do you open a transaction and postpone the commit?
> Using some custom code written against the FS API?
>

It would require some custom code like "create greek tree,
create txn, modify a few nodes" on one side and "open the
only available txn, commit txn" on the other side.


> Or can some tool such as svnmucc already do this?
>

svnadmin can only list and remove txns. svnmucc


> I presume you rely on apr_off_t, not off_t, right?


Yes, I always use apr_off_t. On my system, APR typedefs
it as off_t.


> I.e. it's
> enough to recompile APR and SVN with or without large file
> support to switch between 32bit and 64bit apr_off_t?
>

It will be enough if it changes sizeof(apr_off_t).

-- Stefan^2.

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Branko Čibej <br...@wandisco.com>.
On 30.09.2014 11:15, Stefan Sperling wrote:
> On Mon, Sep 29, 2014 at 07:18:05PM +0200, Stefan Fuhrmann wrote:
>> On Sun, Sep 28, 2014 at 11:17 PM, Stefan Sperling <st...@apache.org> wrote:
>>
>>> On Sun, Sep 28, 2014 at 05:56:01PM -0000, stefan2@apache.org wrote:
>>>> Author: stefan2
>>>> Date: Sun Sep 28 17:56:01 2014
>>>> New Revision: 1628093
>>>>
>>>> URL: http://svn.apache.org/r1628093
>>>> Log:
>>>> Support FSFS format 7 commits in load balanced mixed-architecture clusters.
>>>> At least theoretically, machines with different endianess or off_t sizes
>>>> might access the same repository on e.g. an iSCSI SAN.  Then the machine
>>>> performing a commit may have a different architecture from the one building
>>>> up the transaction.  To allow that, even the intermediate (proto-) index
>>>> format within those transactions must be platform-independent.
>>> Hi Stefan,
>>>
>>> If you describe in some more detail how to set it up I can test this
>>> scenario for you on OpenBSD with sparc64 (64bit big-endian), amd64
>>> (64bit little endian), macppc (32bit big-endian) and i386 (32bit little
>>> endian) machines.
>>>
>> Thanks for the offer. So, this is the issue that I addressed
>> with the above patch:
>>
>> At the FS API level, transactions can be opened (and implicitly
>> closed again) many times before being committed. If these
>> operations are performed from different machines accessing
>> the same repository, they might not agree upon C struct size,
>> layout or interpretation. So, for pure API consistency alone,
>> all data within a txn needs to be portable / architecture independent.
>>
>> I may be wrong but I vaguely remember that the HTTP client
>> sends the "commit" operation over a separate connection that
>> the one it used to build up the txn. If there is a load balancer
>> in front of the actual servers, that commit may get served by
>> a different machine.
>>
>> Testing the general scenario is simpler, though. Have one machine
>> build up a transaction which touches a few of nodes (such that
>> different record sizes would result in missing / misaligned data).
>> Then e.g. copy the repo to a different machine and let that one
>> do the commit. The result must then pass 'svnadmin verify'.
>>
>> The critical combinations are little / big endian and 32 bit vs.
>> 64 bit file offsets (not the same thing as 32 / 64 bit in general).
>> So, 64 bit big endian vs. 32 bit little endian with 32 bit off_t
>> would probably cover it. Tests with both machines in both
>> roles (txn creation and txn commit).
>>
>> -- Stefan^2.
> How do you open a transaction and postpone the commit?
> Using some custom code written against the FS API?
> Or can some tool such as svnmucc already do this?

I think you have to write a tool that works against the repos API. The
Python bindings should be good enough for that.

> I presume you rely on apr_off_t, not off_t, right? I.e. it's
> enough to recompile APR and SVN with or without large file
> support to switch between 32bit and 64bit apr_off_t?

AFAIR changing largefile support does not change the APR ABI since 1.0.

-- Brane

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Stefan Sperling <st...@apache.org>.
On Mon, Sep 29, 2014 at 07:18:05PM +0200, Stefan Fuhrmann wrote:
> On Sun, Sep 28, 2014 at 11:17 PM, Stefan Sperling <st...@apache.org> wrote:
> 
> > On Sun, Sep 28, 2014 at 05:56:01PM -0000, stefan2@apache.org wrote:
> > > Author: stefan2
> > > Date: Sun Sep 28 17:56:01 2014
> > > New Revision: 1628093
> > >
> > > URL: http://svn.apache.org/r1628093
> > > Log:
> > > Support FSFS format 7 commits in load balanced mixed-architecture clusters.
> > >
> > > At least theoretically, machines with different endianess or off_t sizes
> > > might access the same repository on e.g. an iSCSI SAN.  Then the machine
> > > performing a commit may have a different architecture from the one building
> > > up the transaction.  To allow that, even the intermediate (proto-) index
> > > format within those transactions must be platform-independent.
> >
> > Hi Stefan,
> >
> > If you describe in some more detail how to set it up I can test this
> > scenario for you on OpenBSD with sparc64 (64bit big-endian), amd64
> > (64bit little endian), macppc (32bit big-endian) and i386 (32bit little
> > endian) machines.
> >
> 
> Thanks for the offer. So, this is the issue that I addressed
> with the above patch:
> 
> At the FS API level, transactions can be opened (and implicitly
> closed again) many times before being committed. If these
> operations are performed from different machines accessing
> the same repository, they might not agree upon C struct size,
> layout or interpretation. So, for pure API consistency alone,
> all data within a txn needs to be portable / architecture independent.
> 
> I may be wrong but I vaguely remember that the HTTP client
> sends the "commit" operation over a separate connection that
> the one it used to build up the txn. If there is a load balancer
> in front of the actual servers, that commit may get served by
> a different machine.
> 
> Testing the general scenario is simpler, though. Have one machine
> build up a transaction which touches a few of nodes (such that
> different record sizes would result in missing / misaligned data).
> Then e.g. copy the repo to a different machine and let that one
> do the commit. The result must then pass 'svnadmin verify'.
> 
> The critical combinations are little / big endian and 32 bit vs.
> 64 bit file offsets (not the same thing as 32 / 64 bit in general).
> So, 64 bit big endian vs. 32 bit little endian with 32 bit off_t
> would probably cover it. Tests with both machines in both
> roles (txn creation and txn commit).
> 
> -- Stefan^2.

How do you open a transaction and postpone the commit?
Using some custom code written against the FS API?
Or can some tool such as svnmucc already do this?

I presume you rely on apr_off_t, not off_t, right? I.e. it's
enough to recompile APR and SVN with or without large file
support to switch between 32bit and 64bit apr_off_t?

Re: svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sun, Sep 28, 2014 at 11:17 PM, Stefan Sperling <st...@apache.org> wrote:

> On Sun, Sep 28, 2014 at 05:56:01PM -0000, stefan2@apache.org wrote:
> > Author: stefan2
> > Date: Sun Sep 28 17:56:01 2014
> > New Revision: 1628093
> >
> > URL: http://svn.apache.org/r1628093
> > Log:
> > Support FSFS format 7 commits in load balanced mixed-architecture clusters.
> >
> > At least theoretically, machines with different endianess or off_t sizes
> > might access the same repository on e.g. an iSCSI SAN.  Then the machine
> > performing a commit may have a different architecture from the one building
> > up the transaction.  To allow that, even the intermediate (proto-) index
> > format within those transactions must be platform-independent.
>
> Hi Stefan,
>
> If you describe in some more detail how to set it up I can test this
> scenario for you on OpenBSD with sparc64 (64bit big-endian), amd64
> (64bit little endian), macppc (32bit big-endian) and i386 (32bit little
> endian) machines.
>

Thanks for the offer. So, this is the issue that I addressed
with the above patch:

At the FS API level, transactions can be opened (and implicitly
closed again) many times before being committed. If these
operations are performed from different machines accessing
the same repository, they might not agree upon C struct size,
layout or interpretation. So, for pure API consistency alone,
all data within a txn needs to be portable / architecture independent.
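
As a rough illustration of the difference (this is not the actual FSFS
code), the Python sketch below describes a hypothetical index entry once
with the host's native layout and once with an explicitly fixed layout.
The record shape, the field meanings and the helper names are assumptions
made up for this example. Only the fixed form produces the same bytes on
every machine.

```python
import struct

# Hypothetical proto-index entry: (file offset, item size, item type).
# "<" selects little-endian byte order with standard field sizes and no
# padding, so the encoding is independent of the host architecture.
PORTABLE = struct.Struct("<QIB")  # 8 + 4 + 1 = 13 bytes everywhere

# "@" selects the host's native byte order, sizes and alignment. The C
# "long" used for the offset here stands in for a platform-dependent
# offset type; its width varies between platforms.
NATIVE = struct.Struct("@lIB")


def encode_entry(offset, size, item_type):
    """Serialize one entry in the fixed, platform-independent layout."""
    return PORTABLE.pack(offset, size, item_type)


def decode_entry(data):
    """Parse an entry written by encode_entry() on any machine."""
    return PORTABLE.unpack(data)


if __name__ == "__main__":
    raw = encode_entry(4096, 120, 3)
    print("portable size:", PORTABLE.size, "native size:", NATIVE.size)
    print(decode_entry(raw))
```

The native size printed by this script may differ from host to host,
which is exactly the property that makes a raw struct dump unsuitable for
data shared between machines.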

I may be wrong but I vaguely remember that the HTTP client
sends the "commit" operation over a separate connection than
the one it used to build up the txn. If there is a load balancer
in front of the actual servers, that commit may get served by
a different machine.

Testing the general scenario is simpler, though. Have one machine
build up a transaction which touches a few nodes (such that
different record sizes would result in missing / misaligned data).
Then e.g. copy the repo to a different machine and let that one
do the commit. The result must then pass 'svnadmin verify'.

The critical combinations are little / big endian and 32 bit vs.
64 bit file offsets (not the same thing as 32 / 64 bit in general).
So, 64 bit big endian vs. 32 bit little endian with 32 bit off_t
would probably cover it. Test with both machines in both
roles (txn creation and txn commit).
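
The failure modes behind these combinations can be reproduced without a
second machine. The following Python sketch is an illustration only: the
helper name is made up and nothing here is Subversion code. It writes an
offset the way one architecture would dump it and reads the bytes back
the way another architecture would interpret them.

```python
import struct


def reinterpret(value, writer_fmt, reader_fmt):
    """Pack VALUE with WRITER_FMT, then read the leading bytes back with
    READER_FMT, as a machine with a different layout would.

    In this sketch the reader format must not be wider than the writer's.
    """
    raw = struct.pack(writer_fmt, value)
    return struct.unpack(reader_fmt, raw[:struct.calcsize(reader_fmt)])[0]


if __name__ == "__main__":
    # Same width, different byte order: 64 bit big endian writer,
    # 64 bit little endian reader. 4096 comes back as 2**52.
    print(reinterpret(4096, ">q", "<q"))

    # Same byte order, different width: a 64 bit offset read through a
    # 32 bit offset type silently loses its upper half.
    print(reinterpret(2**32 + 5, "<q", "<i"))

    # Both differ: the reader only sees the zero-valued high bytes.
    print(reinterpret(4096, ">q", ">i"))
```

Each case yields a plausible-looking but wrong offset rather than an
error, which is why the result needs to be checked with a full
verification run instead of relying on the commit failing.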

-- Stefan^2.