Posted to commits@subversion.apache.org by jc...@apache.org on 2010/11/21 02:06:07 UTC

svn commit: r1037364 - /subversion/branches/diff-optimizations-bytes/BRANCH-README

Author: jcorvel
Date: Sun Nov 21 01:06:06 2010
New Revision: 1037364

URL: http://svn.apache.org/viewvc?rev=1037364&view=rev
Log:
On the diff-optimizations-bytes branch:

Add a BRANCH-README

Added:
    subversion/branches/diff-optimizations-bytes/BRANCH-README

Added: subversion/branches/diff-optimizations-bytes/BRANCH-README
URL: http://svn.apache.org/viewvc/subversion/branches/diff-optimizations-bytes/BRANCH-README?rev=1037364&view=auto
==============================================================================
--- subversion/branches/diff-optimizations-bytes/BRANCH-README (added)
+++ subversion/branches/diff-optimizations-bytes/BRANCH-README Sun Nov 21 01:06:06 2010
@@ -0,0 +1,19 @@
+The purpose of this branch is to experiment with speeding up 'svn diff',
+especially for large files with lots of unchanged lines.
+
+As a secondary objective, this should also speed up 'svn blame', since blame
+performs a client-side diff for every revision that is part of the blame
+operation. This will only be noticeable if the server and network are fast
+enough that the client becomes the bottleneck (e.g. on a LAN, with a server
+that has a fast backend such as FSFS on SSD).
+
+General approach: reduce the problem set for the LCS algorithm as much as
+possible, by eliminating the identical prefix and suffix before putting the
+tokens (lines) into the token tree (see [1] for some background).
+
+Specific approach for this branch: scan for an identical prefix and suffix
+byte by byte, until a difference is found. This is done immediately after
+opening the datasources, before getting the tokens (lines) and inserting them
+into the token tree.
+
+[1] http://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Reduce_the_problem_set
\ No newline at end of file
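
The byte-by-byte prefix/suffix scan described in the README can be sketched
roughly as follows. This is only an illustration over two in-memory buffers,
not the code on the branch: the names common_affix_t and find_common_affix
are hypothetical and are not part of the Subversion API. The sketch also
assumes both inputs fit in memory, and it omits the step of trimming the
result back to line boundaries, which a line-based diff would presumably
need before tokenizing the remaining middle part.

```c
#include <stddef.h>

/* Hypothetical result type (not a Subversion type): number of identical
   leading and trailing bytes shared by two buffers. */
typedef struct {
    size_t prefix_len;
    size_t suffix_len;
} common_affix_t;

/* Hypothetical helper: scan A and B byte by byte from the front, then
   from the back, stopping at the first difference in each direction. */
static common_affix_t
find_common_affix(const char *a, size_t a_len,
                  const char *b, size_t b_len)
{
    common_affix_t result = { 0, 0 };
    size_t min_len = a_len < b_len ? a_len : b_len;
    size_t max_suffix;

    /* Identical prefix: advance while the bytes match. */
    while (result.prefix_len < min_len
           && a[result.prefix_len] == b[result.prefix_len])
        result.prefix_len++;

    /* Identical suffix: walk backwards, but never into the bytes already
       counted as prefix, so the two regions cannot overlap. */
    max_suffix = min_len - result.prefix_len;
    while (result.suffix_len < max_suffix
           && a[a_len - 1 - result.suffix_len]
              == b[b_len - 1 - result.suffix_len])
        result.suffix_len++;

    return result;
}
```

Only the bytes between prefix_len and (length - suffix_len) in each buffer
would then need to be handed to the LCS step. For two 18-byte inputs that
differ in a single byte on their middle line, that leaves one byte per side.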