You are viewing a plain text version of this content. The canonical link for it is here.
Posted to slide-dev@jakarta.apache.org by Torgeir Veimo <to...@vertech.no> on 2001/12/05 17:37:12 UTC
delta diff implementation
I have the first prototype ready for a java based implementation of Josh
MacDonalds xdelta binary diff/delta algorithm. See
http://sourceforge.net/projects/xdelta for more information on this
algorithm.
It is available for download here; http://www.vertech.no/java-xdelta.tar.gz
This implementation uses the GDIFF file format for storing binary
deltas. There is also a patcher that takes a source file and a patch
file in this format and produces the original file.
When this implementation is mature it might be usefull if someone wants
to implement a versioned store implementation for slide with reduced
physical storage requirements.
I am seeking feedback on this. Warning; this is pre-alpha quality code.
--
-Torgeir
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
Re: delta diff implementation
Posted by Torgeir Veimo <to...@vertech.no>.
Kelvin Tan wrote:
> I couldn't resist peeking...:)
> Why did you choose to implement your own implementation of the Adler32 algo
> in Checksum.java? What about java.util.zip.Adler32?
This is still the weakest part of the implementation. The reason was
that I wanted to try to use the update method, so that only one byte
goes into the adler checksum, instead of regenerating the entire 16 byte
checksum for each new byte.
--
-Torgeir
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
Re: delta diff implementation
Posted by Kelvin Tan <ke...@relevanz.com>.
Torgeir,
I couldn't resist peeking...:)
Why did you choose to implement your own implementation of the Adler32 algo
in Checksum.java? What about java.util.zip.Adler32?
Kelvin
----- Original Message -----
From: Torgeir Veimo <to...@vertech.no>
To: Slide Developers List <sl...@jakarta.apache.org>
Sent: Thursday, December 06, 2001 7:36 PM
Subject: Re: delta diff implementation
> Remy Maucherat wrote:
>
> >>I have the first prototype ready for a java based implementation of Josh
> >>MacDonalds xdelta binary diff/delta algorithm. See
> >>http://sourceforge.net/projects/xdelta for more information on this
> >>algorithm.
> >>
> >>It is available for download here;
> >>
> > http://www.vertech.no/java-xdelta.tar.gz
> >
> >>This implementation uses the GDIFF file format for storing binary
> >>deltas. There is also a patcher that takes a source file and a patch
> >>file in this format and produces the original file.
> >>
> >>When this implementation is mature it might be usefull if someone wants
> >>to implement a versioned store implementation for slide with reduced
> >>physical storage requirements.
> >>
> >>I am seeking feedback on this. Warning; this is pre-alpha quality code.
> >>
> >
> > Obviously, that's a very useful library. I'll try to play with it :)
>
>
> Some notes; as you'll see, the source 'file' needs to be seekable, so I
> use a RandomAccessFile in this implementation. This is a property of it
> being a copy-insert algorithm, and not a delete-insert like vanilla
> diff. It is possible to create an implementation that doesn't the source
> doesn't need to be seekable. But it *has* to be readable twice.
>
> There are implication eg. if one wants to store the last revision as the
> full one, and the previous as a delta; since they last revision might be
> available as an input stream from a http put. One way of doing it would
> be to write to the store first, and at the same time generate the hash
> fingerprints, then do the delta run on the source from the store.
> Reading the source from the store would probably imply that it is
> seekable to?
>
> When patching, the source needs to be seekable as well, due to the same
> reason as above.
>
> It is not very space optimized yet, as it uses a hashtable. I will use
> an array as the hashtable in the next release, as the algorithm really
> was designed for this.
>
> The GDIFFWriter could really be a special outputstream. The same goes
> for the patcher; it could be an inputstream that took the source input
> and the delta input as parameters.
>
> I have tried the algorithm on different input data, but it obviously
> needs a lot of testing. It would also be nice if someone could proof
> read my io code; I'm not shure I got all the signed / unsigned stuff
right.
>
> --
> -Torgeir
>
>
> --
> To unsubscribe, e-mail:
<ma...@jakarta.apache.org>
> For additional commands, e-mail:
<ma...@jakarta.apache.org>
>
>
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
Re: delta diff implementation
Posted by Torgeir Veimo <to...@vertech.no>.
Remy Maucherat wrote:
>>I have the first prototype ready for a java based implementation of Josh
>>MacDonalds xdelta binary diff/delta algorithm. See
>>http://sourceforge.net/projects/xdelta for more information on this
>>algorithm.
>>
>>It is available for download here;
>>
> http://www.vertech.no/java-xdelta.tar.gz
>
>>This implementation uses the GDIFF file format for storing binary
>>deltas. There is also a patcher that takes a source file and a patch
>>file in this format and produces the original file.
>>
>>When this implementation is mature it might be usefull if someone wants
>>to implement a versioned store implementation for slide with reduced
>>physical storage requirements.
>>
>>I am seeking feedback on this. Warning; this is pre-alpha quality code.
>>
>
> Obviously, that's a very useful library. I'll try to play with it :)
Some notes; as you'll see, the source 'file' needs to be seekable, so I
use a RandomAccessFile in this implementation. This is a property of it
being a copy-insert algorithm, and not a delete-insert like vanilla
diff. It is possible to create an implementation that doesn't the source
doesn't need to be seekable. But it *has* to be readable twice.
There are implication eg. if one wants to store the last revision as the
full one, and the previous as a delta; since they last revision might be
available as an input stream from a http put. One way of doing it would
be to write to the store first, and at the same time generate the hash
fingerprints, then do the delta run on the source from the store.
Reading the source from the store would probably imply that it is
seekable to?
When patching, the source needs to be seekable as well, due to the same
reason as above.
It is not very space optimized yet, as it uses a hashtable. I will use
an array as the hashtable in the next release, as the algorithm really
was designed for this.
The GDIFFWriter could really be a special outputstream. The same goes
for the patcher; it could be an inputstream that took the source input
and the delta input as parameters.
I have tried the algorithm on different input data, but it obviously
needs a lot of testing. It would also be nice if someone could proof
read my io code; I'm not shure I got all the signed / unsigned stuff right.
--
-Torgeir
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
Re: delta diff implementation
Posted by Remy Maucherat <re...@apache.org>.
> I have the first prototype ready for a java based implementation of Josh
> MacDonalds xdelta binary diff/delta algorithm. See
> http://sourceforge.net/projects/xdelta for more information on this
> algorithm.
>
> It is available for download here;
http://www.vertech.no/java-xdelta.tar.gz
>
> This implementation uses the GDIFF file format for storing binary
> deltas. There is also a patcher that takes a source file and a patch
> file in this format and produces the original file.
>
> When this implementation is mature it might be usefull if someone wants
> to implement a versioned store implementation for slide with reduced
> physical storage requirements.
>
> I am seeking feedback on this. Warning; this is pre-alpha quality code.
Obviously, that's a very useful library. I'll try to play with it :)
Remy
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>