Posted to dev@subversion.apache.org by Bo Chen <bo...@gmail.com> on 2013/03/06 05:49:06 UTC

some questions about the delta operation in SVN

Can anyone help me clarify the following questions? Thanks very much.

I make some updates, and the SVN client generates the delta and sends it to
the SVN server. Does the server simply store this delta to the repository,
or do something more?

Sometimes I find that the SVN client does not delta the updated version
against the latest version in the client. Is there any rule by which the
client decides which pristine version to delta against, or does the client
just randomly choose a pristine version?

Bo

Re: some questions about the delta operation in SVN

Posted by Ben Reser <be...@reser.org>.
On Wed, Mar 6, 2013 at 7:24 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> See notes/skip-deltas in trunk.

Things are a little bit more complicated in trunk than what's in
notes/skip-deltas for fsfs because we now have some knobs that let you
adjust how the skip deltas behave.  This changes the default behavior
of a newly created repo in trunk versus the behavior in previous
versions.  If you're interested in the way trunk behaves you should
also look at the db/fsfs.conf file under a repository created with a
trunk svnadmin and look at the deltification section.
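
One way to inspect that section for a given repository is to parse it.
This is a minimal sketch, assuming db/fsfs.conf is INI-style with '#'
comments and that the section is named [deltification]. The helper names
are hypothetical, and no particular option names are assumed.

```python
import configparser
from pathlib import Path


def parse_deltification(conf_text: str) -> dict:
    """Return the options set in the [deltification] section, if any."""
    # interpolation=None keeps '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("deltification"):
        return {}
    return dict(parser.items("deltification"))


def read_deltification(repo_path: str) -> dict:
    """Read <repo>/db/fsfs.conf and return its deltification settings."""
    conf = Path(repo_path) / "db" / "fsfs.conf"
    return parse_deltification(conf.read_text(encoding="utf-8"))
```

Options that are commented out in the file do not show up in the result,
so an empty dict means the repository is running on built-in defaults.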

Re: some questions about the delta operation in SVN

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Wed, Mar 06, 2013 at 17:24:46 +0200:
> Bo Chen wrote on Wed, Mar 06, 2013 at 10:14:41 -0500:
> > I am very curious why the server needs to re-compute the skip-delta. Is
> > there any rule to guide the server which pristine version to be delta-ed
> > against? To optimize the delta (specifically, to optimize the storage for
> > the delta)?
> > 
> 
> See notes/skip-deltas in trunk.  The delta base is not the immediate
> previous revision in order to reduce the delta chain length needed for
> reconstructing a random revision of the file from O(log H) to O(H), for

typo: should be "from O(H) to O(log H)"

> a file with H revisions.

Re: some questions about the delta operation in SVN

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Bo Chen wrote on Wed, Mar 06, 2013 at 10:14:41 -0500:
> I am very curious why the server needs to re-compute the skip-delta. Is
> there any rule to guide the server which pristine version to be delta-ed
> against? To optimize the delta (specifically, to optimize the storage for
> the delta)?
> 

See notes/skip-deltas in trunk.  The delta base is not the immediate
previous revision in order to reduce the delta chain length needed for
reconstructing a random revision of the file from O(H) to O(log H), for
a file with H revisions.
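
The logarithmic chain length can be illustrated with a minimal sketch. It
assumes the FSFS rule described in notes/skip-deltas: the delta base for
the n-th change of a file is found by clearing the lowest set bit of n.
Here n counts changes to the file itself, not repository revisions; the
function names are illustrative, and the tuning knobs in trunk can change
the actual choice.

```python
def skip_delta_base(n: int) -> int:
    """Return the delta base for change number n (n >= 1)."""
    if n < 1:
        raise ValueError("change 0 is stored as fulltext and has no base")
    # n & (n - 1) clears the lowest set bit of n.
    return n & (n - 1)


def delta_chain(n: int) -> list:
    """List the changes whose deltas must be read to rebuild change n."""
    chain = []
    while n > 0:
        chain.append(n)
        n = skip_delta_base(n)
    return chain  # the chain ends at change 0, which is a fulltext


print(delta_chain(54))  # [54, 52, 48, 32]
```

The chain length equals the number of 1 bits in n, so it never exceeds
about log2(n) + 1, whereas always deltifying against the immediate
predecessor would give a chain of length n.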

Re: some questions about the delta operation in SVN

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Mar 6, 2013 at 4:14 PM, Bo Chen <bo...@gmail.com> wrote:

> Clarify one point: The file@base (or the file @head) refers to the file I
> am currently updating, right?
>
> I am very curious why the server needs to re-compute the skip-delta. Is
> there any rule to guide the server which pristine version to be delta-ed
> against? To optimize the delta (specifically, to optimize the storage for
> the delta)?
>
> Thanks.
>
> Bo
>

Hi Bo,

To prevent any confusion, I want to point out that there are
two *independent* places where deltas are being applied:

(1) Between client and server.
    This is to conserve network bandwidth and is optional
    (depending on the protocol, they may simply send fulltext).
    The delta base is always the latest version that client has,
    i.e. BASE. In case of an update, the client tells the server
    what the respective BASE revision is ("I'm on rev 42 for
    sub-tree X/Y. Please send me the data for revision 59.")

    Data sent from the client to the server is *always* fully
    reconstructed from the incoming delta. This is necessary
    to calculate and verify the MD5 / SHA1 checksums. All
    of this happens "streamy", i.e. the data gets reconstructed
    and processed *while* coming in. There is no temporary
    file on the server eventually containing the full file contents.

    Data sent from the server to the client always starts as
    a fulltext read from the repository. If the client already has
    another version of that file and the protocol supports deltas,
    the server will read that other revision from the repository, too,
    and then calculate the delta while streaming it to the client.

(2) When the server writes data to the repository, it starts off with
    some fulltext coming in and *may* choose to deltify the new
    contents against some existing contents.

    This is done to conserve disk space and results in a chain of
    deltas that *all* need to be read and combined to reconstruct
    the fulltext. As Ben already pointed out, 1.8 has a number of
    tuning knobs that allow you to shift the balance between data
    size (small deltas) and reconstruction effort (number of deltas
    to read and process for a given fulltext).
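
The "streamy" reconstruction and checksumming described in (1) can be
sketched as follows. This is only an illustration under stated
assumptions: the instruction format below is a hypothetical simplification
and not the real svndiff encoding, the base text is held in memory for
brevity, and the function names are made up.

```python
import hashlib
from typing import Iterable, Iterator, Tuple

# Hypothetical delta instructions:
#   ("copy", offset, length)  copy bytes from the base text
#   ("new", data)             literal bytes carried in the delta


def apply_delta(base: bytes, ops: Iterable[tuple]) -> Iterator[bytes]:
    """Yield the reconstructed text piece by piece, never as a whole."""
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            if offset < 0 or length < 0 or offset + length > len(base):
                raise ValueError("copy reaches outside the base text")
            yield base[offset:offset + length]
        elif op[0] == "new":
            yield op[1]
        else:
            raise ValueError("unknown instruction: %r" % (op[0],))


def checksum_stream(chunks: Iterable[bytes]) -> Tuple[str, str, int]:
    """Feed each chunk to MD5 and SHA1 as it arrives."""
    md5, sha1, size = hashlib.md5(), hashlib.sha1(), 0
    for chunk in chunks:
        md5.update(chunk)
        sha1.update(chunk)
        size += len(chunk)
        # A real server would hand the chunk on to the storage layer here.
    return md5.hexdigest(), sha1.hexdigest(), size


ops = [("copy", 0, 6), ("new", b"there")]
digests = checksum_stream(apply_delta(b"hello world", ops))
```

Because both functions work on an iterator of chunks, the full
reconstructed text never has to exist in one buffer or temporary file.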

-- Stefan^2.


Re: some questions about the delta operation in SVN

Posted by Bo Chen <bo...@gmail.com>.
Clarify one point: The file@base (or the file @head) refers to the file I
am currently updating, right?

I am very curious why the server needs to re-compute the skip-delta. Is
there any rule to guide the server which pristine version to be delta-ed
against? To optimize the delta (specifically, to optimize the storage for
the delta)?

Thanks.

Bo

On Wed, Mar 6, 2013 at 7:05 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:

> Branko Čibej wrote on Wed, Mar 06, 2013 at 06:41:40 +0100:
> > On 06.03.2013 06:21, Daniel Shahaf wrote:
> > > Bo Chen wrote on Tue, Mar 05, 2013 at 23:49:06 -0500:
> > >> Can anyone help me make clear the following questions? Thanks very much.
> > >>
> > >> I make some updates, and the SVN client generates the delta and sends it to
> > >> the SVN server. Does the server simply store this delta to the repository,
> > >> or do something more?
> > >>
> > > The latter.  The client always generates a delta against file.c@HEAD,
> > > but the filesystem stores skip-deltas.
> >
> > That would be file@BASE, since that's what the client has a pristine
> > version of. :)
>
> @BASE and @HEAD will be the same node-rev; otherwise the commit will fail.
>

Re: some questions about the delta operation in SVN

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Branko Čibej wrote on Wed, Mar 06, 2013 at 06:41:40 +0100:
> On 06.03.2013 06:21, Daniel Shahaf wrote:
> > Bo Chen wrote on Tue, Mar 05, 2013 at 23:49:06 -0500:
> >> Can anyone help me make clear the following questions? Thanks very much.
> >>
> >> I make some updates, and the SVN client generates the delta and sends it to
> >> the SVN server. Does the server simply store this delta to the repository,
> >> or do something more?
> >>
> > The latter.  The client always generates a delta against file.c@HEAD,
> > but the filesystem stores skip-deltas.
> 
> That would be file@BASE, since that's what the client has a pristine
> version of. :)

@BASE and @HEAD will be the same node-rev; otherwise the commit will fail.

Re: some questions about the delta operation in SVN

Posted by Branko Čibej <br...@wandisco.com>.
On 06.03.2013 06:21, Daniel Shahaf wrote:
> Bo Chen wrote on Tue, Mar 05, 2013 at 23:49:06 -0500:
>> Can anyone help me make clear the following questions? Thanks very much.
>>
>> I make some updates, and the SVN client generates the delta and sends it to
>> the SVN server. Does the server simply store this delta to the repository,
>> or do something more?
>>
> The latter.  The client always generates a delta against file.c@HEAD,
> but the filesystem stores skip-deltas.

That would be file@BASE, since that's what the client has a pristine
version of. :)

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com


Re: some questions about the delta operation in SVN

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Bo Chen wrote on Tue, Mar 05, 2013 at 23:49:06 -0500:
> Can anyone help me make clear the following questions? Thanks very much.
> 
> I make some updates, and the SVN client generates the delta and sends it to
> the SVN server. Does the server simply store this delta to the repository,
> or do something more?
> 

The latter.  The client always generates a delta against file.c@HEAD,
but the filesystem stores skip-deltas.

> Sometimes I find the SVN client does not delta the updated version with the
> latest version in the client. Is there any rule under which the client
> decides which pristine version to be delta-ed against, or the client just
> randomly chooses a pristine version to delta against?
> 

I don't understand what your issue is.  Are you able to describe it in
user-visible or API-consumer-visible terms?  The choice of delta base
(in both client and server) is entirely an implementation detail.

> Bo