Posted to commits@subversion.apache.org by ju...@apache.org on 2015/01/26 19:28:34 UTC
svn commit: r1654857 - in /subversion/branches/dump-load-cross-check/subversion/tests/cmdline: svnadmin_tests.py svntest/sandbox.py svntest/testcase.py svntest/verify.py

Author: julianfoad
Date: Mon Jan 26 18:28:33 2015
New Revision: 1654857

URL: http://svn.apache.org/r1654857
Log:
Implement dump/load cross-checking in the test suite.

After every Python test in the test suite, dump the created repository (if
any) with svnadmin and with svnrdump and compare the two. Load each dump
with the other tool to check that they can load each other's dumps and
interpret them the same way. Run each dump through a no-op svndumpfilter
command and check it doesn't change the dumpfile significantly.
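
As a rough illustration of the round trips described above, the following
sketch only builds the command lines involved; it does not run anything.
The helper name cross_check_commands and its return structure are
hypothetical and not part of the test suite, which drives the tools through
svntest helpers and loads each dump into its own fresh repository.

```python
def cross_check_commands(repo_dir, repo_url, loaded_dir, loaded_url):
    """Return (label, dump_cmd, load_cmd) for each cross-check round trip.

    Hypothetical helper for illustration only; nothing is executed here.
    For brevity a single target repository is named, whereas the real
    check uses a separate freshly created repository per dump.
    """
    dumps = [
        ("svnadmin, non-delta", ["svnadmin", "dump", repo_dir]),
        ("svnadmin, deltas", ["svnadmin", "dump", "--deltas", repo_dir]),
        ("svnrdump", ["svnrdump", "dump", "-q", repo_url]),
    ]
    plan = []
    for label, dump_cmd in dumps:
        # Load each dump with the *other* tool.
        if dump_cmd[0] == "svnadmin":
            load_cmd = ["svnrdump", "load", loaded_url]
        else:
            load_cmd = ["svnadmin", "load", loaded_dir]
        plan.append((label, dump_cmd, load_cmd))
    return plan


# The no-op filter that each dump is also piped through.
NOOP_FILTER = ["svndumpfilter", "--quiet", "include", "/"]
```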

If cross-checking fails, raise a test failure in the normal way, as if the
cross-checking were part of the test.
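
A minimal sketch of that behaviour, assuming a stand-in Failure exception
and a hypothetical FakeSandbox class (neither is the real svntest code): the
cross-check runs after the test body, and any exception it raises propagates
exactly as an exception from the test body would.

```python
class Failure(Exception):
    """Stand-in for svntest.Failure (an assumption for this sketch)."""


class FakeSandbox:
    """Hypothetical sandbox whose verify() can be made to fail."""

    def __init__(self, dumps_match):
        self.dumps_match = dumps_match

    def verify(self):
        if not self.dumps_match:
            raise Failure("dump/load cross-check failed")


def run_test(func, sandbox):
    """Run the test body, then the cross-check; either may raise."""
    result = func(sandbox)
    sandbox.verify()
    return result
```

With this shape the test runner needs no special handling: a failed
cross-check is reported in the same way as a failed test.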

A few tests deliberately create a broken repository. Adjust those tests
either to delete or to repair the repository before exiting, so that we do
not try to run this cross-checking on a broken repository.

Motivation: Cross-checking reveals bugs and inconsistencies between the
different dump/load tools. The test suite by its nature provides a good
selection of interesting repository contents.

### TODO:
  - Put back the dumpfile parser's ability to check for well-formed dumpfile
    contents, which has been temporarily stripped out here in order to
    achieve more generic parsing, as some dump/load tests rely on that.

  - Bypass authz when dumping with svnrdump, otherwise a partial dump is
    obtained for tests using authz, making those tests fail.

  - Make optional -- enabled by a configure option?

Ideas for improvement:
  - Improve the logic for finding repositories created by a test: detect
    when a test created a repository even if the sandbox is not marked as
    'built'; detect when a test created additional repositories.

  - Implement the same cross-checking for the C tests.

* subversion/tests/cmdline/svntest/sandbox.py
  (Sandbox.__init__): Remember the current working directory.
  (Sandbox.verify_repo,
   Sandbox.verify): New methods implementing dump/load cross-checking.

* subversion/tests/cmdline/svntest/testcase.py
  (FunctionTestCase.run): Call the sandbox's verify method afterwards.

* subversion/tests/cmdline/svntest/verify.py
  (DumpParser): Improve the parsing of headers, fixing bugs and making it
    more generic. Don't store the number of blank lines, for now, because at
    the moment these often differ between svnadmin and svnrdump.
  (compare_dump_files): Add new options: 'ignore_uuid',
    'expect_content_length_always', 'ignore_empty_prop_sections'.

* subversion/tests/cmdline/svnadmin_tests.py
  (fsfs_recover_handle_missing_revs_or_revprops_file): Restore the
    repository to a non-corrupt state, to avoid failing to dump.
  (verify_keep_going,
   verify_invalid_path_changes,
   verify_quickly): Remove the corrupt repository, to avoid failing to dump.
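
The more generic header parsing mentioned for DumpParser above can be
sketched as follows. This is a standalone approximation: the free function
parse_headers and its (headers, index) return value are assumptions made for
illustration, whereas the real parser is a method that advances
self.current.

```python
import re

# Same pattern as the parser uses: key, colon, space, value (may be empty).
HEADER_RE = re.compile(r'([^:]*): (.*)$')


def parse_headers(lines, start=0):
    """Collect (key, value) pairs from lines[start:] up to the first blank line.

    Returns the list of headers and the index of the terminating blank line.
    """
    headers = []
    i = start
    while i < len(lines) and lines[i] != '\n':
        m = HEADER_RE.match(lines[i])
        if not m:
            raise ValueError("expected a header at line %d, but found: %r"
                             % (i, lines[i]))
        headers.append(m.groups())
        i += 1
    return headers, i
```

Because every header is accepted whatever its name or order, a caller can
build a dictionary from the result and look up the optional headers with
dict.get(), which yields None for the ones a given tool omits.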

Modified:
    subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svnadmin_tests.py
    subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/sandbox.py
    subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/testcase.py
    subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/verify.py

Modified: subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svnadmin_tests.py
URL: http://svn.apache.org/viewvc/subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svnadmin_tests.py?rev=1654857&r1=1654856&r2=1654857&view=diff
==============================================================================
--- subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svnadmin_tests.py (original)
+++ subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svnadmin_tests.py Mon Jan 26 18:28:33 2015
@@ -1238,6 +1238,10 @@ def fsfs_recover_handle_missing_revs_or_
     ".*Revision 3 has a non-file where its revprops file should be.*"):
     raise svntest.Failure
 
+  # Restore the r3 revprops file, thus repairing the repository.
+  os.rmdir(revprop_3)
+  os.rename(revprop_was_3, revprop_3)
+
 
 #----------------------------------------------------------------------
 
@@ -2151,6 +2155,9 @@ def verify_keep_going(sbox):
                                    None, errput, None, "svnadmin: E165011:.*"):
     raise svntest.Failure
 
+  # Don't leave a corrupt repository
+  svntest.main.safe_rmtree(sbox.repo_dir, True)
+
 @SkipUnless(svntest.main.is_fs_type_fsfs)
 def verify_invalid_path_changes(sbox):
   "detect invalid changed path list entries"
@@ -2290,6 +2297,9 @@ def verify_invalid_path_changes(sbox):
                                    None, errput, None, "svnadmin: E165011:.*"):
     raise svntest.Failure
 
+  # Don't leave a corrupt repository
+  svntest.main.safe_rmtree(sbox.repo_dir, True)
+
 
 def verify_denormalized_names(sbox):
   "detect denormalized names and name collisions"
@@ -2669,6 +2679,9 @@ def verify_quickly(sbox):
                                    output, errput, exp_out, exp_err):
     raise svntest.Failure
 
+  # Don't leave a corrupt repository
+  svntest.main.safe_rmtree(sbox.repo_dir, True)
+
 
 @SkipUnless(svntest.main.is_fs_type_fsfs)
 @SkipUnless(svntest.main.fs_has_pack)

Modified: subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/sandbox.py
URL: http://svn.apache.org/viewvc/subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/sandbox.py?rev=1654857&r1=1654856&r2=1654857&view=diff
==============================================================================
--- subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/sandbox.py (original)
+++ subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/sandbox.py Mon Jan 26 18:28:33 2015
@@ -26,6 +26,7 @@ import shutil
 import copy
 import urllib
 import logging
+import re
 
 import svntest
 
@@ -46,6 +47,8 @@ class Sandbox:
     # This flag is set to True by build() and returned by is_built()
     self._is_built = False
 
+    self.was_cwd = os.getcwd()
+
   def _set_name(self, name, read_only=False):
     """A convenience method for renaming a sandbox, useful when
     working with multiple repositories in the same unit test."""
@@ -381,6 +384,105 @@ class Sandbox:
     youngest = int(output[0])
     return youngest
 
+  def verify_repo(self):
+    """Cross-check svnadmin and svnrdump dump/load on this sandbox's repository.
+    """
+    svnrdump_headers_missing = re.compile(
+        "Text-content-sha1: .*|Text-copy-source-md5: .*|"
+        "Text-copy-source-sha1: .*|Text-delta-base-sha1: .*"
+    )
+    svnrdump_headers_always = re.compile(
+        "Prop-delta: .*"
+    )
+
+    dumpfile_a_n = svntest.actions.run_and_verify_dump(self.repo_dir,
+                                                       deltas=False)
+    dumpfile_a_d = svntest.actions.run_and_verify_dump(self.repo_dir,
+                                                       deltas=True)
+    dumpfile_r_d = svntest.actions.run_and_verify_svnrdump(
+                             None, svntest.verify.AnyOutput, [], 0,
+                             'dump', '-q', self.repo_url)
+
+    # Compare the two deltas dumpfiles, ignoring expected differences
+    dumpfile_a_d_cmp = [l for l in dumpfile_a_d
+                       if not svnrdump_headers_missing.match(l)
+                                and not svnrdump_headers_always.match(l)]
+    dumpfile_r_d_cmp = [l for l in dumpfile_r_d
+                       if not svnrdump_headers_always.match(l)]
+    svntest.verify.compare_dump_files(None, None,
+                                      dumpfile_a_d_cmp,
+                                      dumpfile_r_d_cmp)
+
+    # Try loading the dump files.
+    # For extra points, load each with the other tool:
+    #   svnadmin dump | svnrdump load
+    #   svnrdump dump | svnadmin load
+    repo_dir_a_n, repo_url_a_n = self.add_repo_path('load_a_n')
+    svntest.main.create_repos(repo_dir_a_n)
+    svntest.actions.enable_revprop_changes(repo_dir_a_n)
+    svntest.actions.run_and_verify_svnrdump(dumpfile_a_n,
+                                            svntest.verify.AnyOutput,
+                                            [], 0, 'load', repo_url_a_n)
+
+    repo_dir_a_d, repo_url_a_d = self.add_repo_path('load_a_d')
+    svntest.main.create_repos(repo_dir_a_d)
+    svntest.actions.enable_revprop_changes(repo_dir_a_d)
+    svntest.actions.run_and_verify_svnrdump(dumpfile_a_d,
+                                            svntest.verify.AnyOutput,
+                                            [], 0, 'load', repo_url_a_d)
+
+    repo_dir_r_d, repo_url_r_d = self.add_repo_path('load_r_d')
+    svntest.main.create_repos(repo_dir_r_d)
+    svntest.actions.run_and_verify_load(repo_dir_r_d, dumpfile_r_d)
+
+    # Dump the loaded repositories in the same way; expect exact equality
+    reloaded_dumpfile_a_n = svntest.actions.run_and_verify_dump(repo_dir_a_n)
+    reloaded_dumpfile_a_d = svntest.actions.run_and_verify_dump(repo_dir_a_d)
+    reloaded_dumpfile_r_d = svntest.actions.run_and_verify_dump(repo_dir_r_d)
+    svntest.verify.compare_dump_files(None, None,
+                                      reloaded_dumpfile_a_n,
+                                      reloaded_dumpfile_a_d,
+                                      ignore_uuid=True)
+    svntest.verify.compare_dump_files(None, None,
+                                      reloaded_dumpfile_a_d,
+                                      reloaded_dumpfile_r_d,
+                                      ignore_uuid=True)
+
+    # Run each dump through svndumpfilter and check for no further change.
+    for dumpfile in [dumpfile_a_n,
+                     dumpfile_a_d,
+                     dumpfile_r_d
+                     ]:
+      ### No buffer size seems to work for update_tests-2. So skip that test?
+      ### (Its dumpfile size is ~360 KB non-delta, ~180 KB delta.)
+      if len(''.join(dumpfile)) > 100000:
+        continue
+
+      exit_code, dumpfile2, errput = svntest.main.run_command_stdin(
+        svntest.main.svndumpfilter_binary, None, -1, True,
+        dumpfile, '--quiet', 'include', '/')
+      assert not exit_code and not errput
+      # Ignore empty prop sections in the input file during comparison, as
+      # svndumpfilter strips them.
+      svntest.verify.compare_dump_files(None, None, dumpfile, dumpfile2,
+                                        expect_content_length_always=True,
+                                        ignore_empty_prop_sections=True)
+
+  def verify(self):
+    """Do additional testing that should hold for any sandbox, such as
+       verifying that the repository can be dumped.
+    """
+    if self.is_built() and not self.read_only:
+      # verify that we can in fact dump the repo
+      # (except for the few tests that deliberately corrupt the repo)
+      os.chdir(self.was_cwd)
+      if os.path.exists(self.repo_dir):
+        print("Cross-checking dump/load...")
+        self.verify_repo()
+    else:
+      print("NOT testing dump: is_built=%d, read_only=%d"
+            % (self.is_built(), self.read_only))
+
 def is_url(target):
   return (target.startswith('^/')
           or target.startswith('file://')

Modified: subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/testcase.py
URL: http://svn.apache.org/viewvc/subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/testcase.py?rev=1654857&r1=1654856&r2=1654857&view=diff
==============================================================================
--- subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/testcase.py (original)
+++ subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/testcase.py Mon Jan 26 18:28:33 2015
@@ -173,7 +173,9 @@ class FunctionTestCase(TestCase):
     return os.path.splitext(os.path.basename(filename))[0]
 
   def run(self, sandbox):
-    return self.func(sandbox)
+    result = self.func(sandbox)
+    sandbox.verify()
+    return result
 
 
 class _XFail(TestCase):

Modified: subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/verify.py
URL: http://svn.apache.org/viewvc/subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/verify.py?rev=1654857&r1=1654856&r2=1654857&view=diff
==============================================================================
--- subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/verify.py (original)
+++ subversion/branches/dump-load-cross-check/subversion/tests/cmdline/svntest/verify.py Mon Jan 26 18:28:33 2015
@@ -474,8 +474,10 @@ class DumpParser:
     if not m:
       if required:
         raise SVNDumpParseError("expected '%s' at line %d\n%s"
+                                "\nPrevious lines:\n%s"
                                 % (regex, self.current,
-                                   self.lines[self.current]))
+                                   self.lines[self.current],
+                                   ''.join(self.lines[max(0,self.current - 10):self.current])))
       else:
         return None
     self.current += 1
@@ -491,6 +493,26 @@ class DumpParser:
     self.current += 1
     return True
 
+  def parse_header(self, header):
+    regex = '([^:]*): (.*)$'
+    m = re.match(regex, self.lines[self.current])
+    if not m:
+      raise SVNDumpParseError("expected a header '%s' at line %d, but found:\n%s"
+                              % (regex, self.current,
+                                 self.lines[self.current]))
+    self.current += 1
+    return m.groups()
+
+  def parse_headers(self):
+    headers = []
+    while self.lines[self.current] != '\n':
+      key, val = self.parse_header(self)
+      headers.append((key, val))
+    return headers
+
+  def parse_boolean(self, header, required):
+    return self.parse_line(header + ': (false|true)$', required)
+
   def parse_format(self):
     return self.parse_line('SVN-fs-dump-format-version: ([0-9]+)$')
 
@@ -500,6 +522,9 @@ class DumpParser:
   def parse_revision(self):
     return self.parse_line('Revision-number: ([0-9]+)$')
 
+  def parse_prop_delta(self):
+    return self.parse_line('Prop-delta: (false|true)$', required=False)
+
   def parse_prop_length(self, required=True):
     return self.parse_line('Prop-content-length: ([0-9]+)$', required)
 
@@ -507,10 +532,7 @@ class DumpParser:
     return self.parse_line('Content-length: ([0-9]+)$', required)
 
   def parse_path(self):
-    path = self.parse_line('Node-path: (.+)$', required=False)
-    if not path and self.lines[self.current] == 'Node-path: \n':
-      self.current += 1
-      path = ''
+    path = self.parse_line('Node-path: (.*)$', required=False)
     return path
 
   def parse_kind(self):
@@ -541,6 +563,15 @@ class DumpParser:
   def parse_text_sha1(self):
     return self.parse_line('Text-content-sha1: ([0-9a-z]+)$', required=False)
 
+  def parse_text_delta(self):
+    return self.parse_line('Text-delta: (false|true)$', required=False)
+
+  def parse_text_delta_base_md5(self):
+    return self.parse_line('Text-delta-base-md5: ([0-9a-f]+)$', required=False)
+
+  def parse_text_delta_base_sha1(self):
+    return self.parse_line('Text-delta-base-sha1: ([0-9a-f]+)$', required=False)
+
   def parse_text_length(self):
     return self.parse_line('Text-content-length: ([0-9]+)$', required=False)
 
@@ -570,8 +601,14 @@ class DumpParser:
 
         return key
 
-      key = read_key_or_value(curprop)
-      value = read_key_or_value(curprop)
+      if props[curprop[0]].startswith('K'):
+        key = read_key_or_value(curprop)
+        value = read_key_or_value(curprop)
+      elif props[curprop[0]].startswith('D'):
+        key = read_key_or_value(curprop)
+        value = None
+      else:
+        raise SVNDumpParseError("unexpected property line: %r" % props[curprop[0]])
       prophash[key] = value
 
     return prophash
@@ -590,17 +627,28 @@ class DumpParser:
 
   def parse_one_node(self):
     node = {}
-    node['kind'] = self.parse_kind()
-    action = self.parse_action()
-    node['copyfrom_rev'] = self.parse_copyfrom_rev()
-    node['copyfrom_path'] = self.parse_copyfrom_path()
-    node['copy_md5'] = self.parse_copy_md5()
-    node['copy_sha1'] = self.parse_copy_sha1()
-    node['prop_length'] = self.parse_prop_length(required=False)
-    node['text_length'] = self.parse_text_length()
-    node['text_md5'] = self.parse_text_md5()
-    node['text_sha1'] = self.parse_text_sha1()
-    node['content_length'] = self.parse_content_length(required=False)
+    headers_list = self.parse_headers()
+    headers = { k:v for (k, v) in headers_list }
+    for key, header in [
+        ('kind', 'Node-kind'),
+        ('copyfrom_rev', 'Node-copyfrom-rev'),
+        ('copyfrom_path', 'Node-copyfrom-path'),
+        ('copy_md5', 'Text-copy-source-md5'),
+        ('copy_sha1', 'Text-copy-source-sha1'),
+        ('prop_delta', 'Prop-delta'),
+        ('prop_length', 'Prop-content-length'),
+        ('text_delta', 'Text-delta'),
+        ('text_delta_base_md5', 'Text-delta-base-md5'),
+        ('text_delta_base_sha1', 'Text-delta-base-sha1'),
+        ('text_length', 'Text-content-length'),
+        ('text_md5', 'Text-content-md5'),
+        ('text_sha1', 'Text-content-sha1'),
+        ('content_length', 'Content-length'),
+        ]:
+      node[key] = headers.get(header, None)
+
+    action = headers['Node-action']
+
     self.parse_blank()
     if node['prop_length']:
       node['props'] = self.get_props()
@@ -613,7 +661,9 @@ class DumpParser:
     blanks = 0
     while self.current < len(self.lines) and self.parse_blank(required=False):
       blanks += 1
-    node['blanks'] = blanks
+    ### disable temporarily, as svnrdump behaves differently from svnadmin
+    ### on a replace-with-copy (bug -- should file an issue)
+    #node['blanks'] = blanks
     return action, node
 
   def parse_all_nodes(self):
@@ -622,7 +672,7 @@ class DumpParser:
       if self.current >= len(self.lines):
         break
       path = self.parse_path()
-      if not path and not path is '':
+      if path is None:
         break
       if not nodes.get(path):
         nodes[path] = {}
@@ -660,7 +710,10 @@ class DumpParser:
     self.parse_all_revisions()
     return self.parsed
 
-def compare_dump_files(message, label, expected, actual):
+def compare_dump_files(message, label, expected, actual,
+                       ignore_uuid=False,
+                       expect_content_length_always=False,
+                       ignore_empty_prop_sections=False):
   """Parse two dump files EXPECTED and ACTUAL, both of which are lists
   of lines as returned by run_and_verify_dump, and check that the same
   revisions, nodes, properties, etc. are present in both dumps.
@@ -669,8 +722,37 @@ def compare_dump_files(message, label, e
   parsed_expected = DumpParser(expected).parse()
   parsed_actual = DumpParser(actual).parse()
 
+  if ignore_uuid:
+    parsed_expected['uuid'] = '<ignored>'
+    parsed_actual['uuid'] = '<ignored>'
+
+  for parsed in [parsed_expected, parsed_actual]:
+    for rev_name, rev_record in parsed.items():
+      #print "Found %s" % (rev_name,)
+      if 'nodes' in rev_record:
+        #print "Found %s.%s" % (rev_name, 'nodes')
+        for path_name, path_record in rev_record['nodes'].items():
+          #print "Found %s.%s.%s" % (rev_name, 'nodes', path_name)
+          for action_name, action_record in path_record.items():
+            #print "Found %s.%s.%s.%s" % (rev_name, 'nodes', path_name, action_name)
+
+            if expect_content_length_always:
+              if action_record.get('content_length') is None:
+                #print 'Adding: %s.%s.%s.%s.%s' % (rev_name, 'nodes', path_name, action_name, 'content_length=0')
+                action_record['content_length'] = '0'
+            if ignore_empty_prop_sections:
+              if action_record.get('prop_length') == '10':
+                #print 'Removing: %s.%s.%s.%s.%s' % (rev_name, 'nodes', path_name, action_name, 'prop_length')
+                action_record['prop_length'] = None
+                del action_record['props']
+                old_content_length = int(action_record['content_length'])
+                action_record['content_length'] = str(old_content_length - 10)
+
   if parsed_expected != parsed_actual:
-    raise svntest.Failure('\n' + '\n'.join(ndiff(
+    print('DIFF of raw dumpfiles (including expected differences)')
+    print(''.join(ndiff(expected, actual)))
+    raise svntest.Failure('DIFF of parsed dumpfiles (ignoring expected differences)\n'
+                          + '\n'.join(ndiff(
           pprint.pformat(parsed_expected).splitlines(),
           pprint.pformat(parsed_actual).splitlines())))
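
The per-node normalization that compare_dump_files applies before comparing
can be sketched in isolation as below. The function normalize_node is
hypothetical and works on a copy, whereas the code above edits the parsed
records in place. The value 10 is the length of an empty property section,
which consists only of the line "PROPS-END".

```python
EMPTY_PROPS_LENGTH = len("PROPS-END\n")  # 10 bytes


def normalize_node(record, expect_content_length_always=False,
                   ignore_empty_prop_sections=False):
    """Return a copy of a parsed node record with expected differences removed."""
    out = dict(record)
    if expect_content_length_always and out.get('content_length') is None:
        # Treat a missing Content-length header as an explicit zero.
        out['content_length'] = '0'
    if (ignore_empty_prop_sections
            and out.get('prop_length') == str(EMPTY_PROPS_LENGTH)):
        # Drop the empty property section and its share of Content-length.
        out['prop_length'] = None
        out.pop('props', None)
        out['content_length'] = str(int(out['content_length'])
                                    - EMPTY_PROPS_LENGTH)
    return out
```

With both options enabled, a node that carries an empty property section and
a node that omits the section (and Content-length) entirely normalize to the
same record, which is what the svndumpfilter comparison relies on.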