Posted to commits@buildstream.apache.org by gi...@apache.org on 2020/12/29 13:13:21 UTC

[buildstream] branch jmac/cas_to_cas_oct_v2 created (now 68e1054)

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git.


      at 68e1054  Restructure of .import()

This branch includes the following new commits:

     new b833011  buildstream/storage/__init__.py: import CasBasedDirectory
     new 6aea589  tests/storage/virtual_directory_import.py: New file
     new d8cc12d  CasBasedDirectory: New way of doing list_relative_paths - not working yet
     new 1700322  Tests: add test_directory_listing
     new 64763b3  _casbaseddirectory: Corrections to list_relative_files, adds _resolve
     new 0e6901c  _casbaseddirectory: Optionally resolve absolute symlinks
     new dde560d  _casbaseddirectory: _list_relative_paths now passes all tests
     new f242af8  _casbaseddirectory: Ingore '.' when resolving
     new ef7c4cd  virtual_directory_import.py: Add random test generator
     new c996eb7  virtual_directory_import.py: Check import via a file-based directory
     new 9a24fd1  Add code necessary to do cas-to-cas import
     new 2a47105  Add a tool to show differences in two CAS directories
     new 56a25c1  Correct deleting and overwriting cases
     new 6d16d70  Fix 'remove_item'->delete_entry
     new 6b741f3  casbaseddirectory: Various fixes.
     new f0cc6d1  Virtual directory test: Expand random testing to 6 roots
     new 0961607  CASBasedDirectory: Do not sort the input file list!
     new 44c5f2e  CASBasedDirectory: Import '.'
     new c7467e7  Don't forcbily create directories in _resolve in all cases
     new 7677f3f  CAS-to-CAS: Now passing all 20x20 tests
     new 4a892fa  Separation of fixed/random tests in virtual_directory_import
     new 14623fe  hack: remove files which previously blocked directory creation
     new 5fc7d6b  Detect infinite symlink loops in resolve()
     new af6b327  Make the duplication test optional in cas_based_directory
     new fa98975  virtual_directory_test.py: More fixed examples and better test names
     new b89b786  CasBasedDirectory: Remove some prints
     new ac897f8  Make virtual_directory_test do the cas roundtrip test instead of _casbaseddirectory
     new 7d9d395  casbaseddirectory: Remove roundtrip checking code
     new 8205b88  _casbaseddirectory.py: Remove some unnecessary things, label others
     new 5d2a97b  casbaseddirectory: Combine all the _resolve functions
     new e5e5be9  Rearrange comment
     new 505c6dd  Add a main() to virtual_directory_test.py to allow manual testing
     new 0f429c9  CasBasedDirectory: Remove 6 functions and rename files_in_subdir -> _files_in_subdir
     new f206850  _casbaseddirectory.py: _resolve_symlink_or_directory -> _force_resolve
     new 1a0669a  casbaseddirectory: Replace one instance of _force_resolve with descend
     new c74845d  Remove create_directory
     new 3cced84  virtual_directory_test: PEP8
     new 7bb8f20  Remove some prints and whitespace
     new 3c70496  Rename _add_new_link and remove duplicated code
     new 545219d  Remove some prints and improve comments
     new c9cfcf0  _casbaseddirectory: Restructure resolve to make it a bit more logical
     new 285b6f3  Remove _symlink_target_is_directory
     new 68e1054  Restructure of .import()

The 43 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



[buildstream] 42/43: Remove _symlink_target_is_directory

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 285b6f3347327773df8d67af1ff3ac7d7341d3fa
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 12:21:55 2018 +0000

    Remove _symlink_target_is_directory
---
 buildstream/storage/_casbaseddirectory.py | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index cd06649..5fc4099 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -486,9 +486,6 @@ class CasBasedDirectory(Directory):
             dirname += os.path.sep
         return [f[len(dirname):] for f in sorted_files if f.startswith(dirname)]
 
-    def _symlink_target_is_directory(self, symlink_node):
-        x = self._resolve(symlink_node.name)
-        return isinstance(x, CasBasedDirectory)
 
     def _partial_import_cas_into_cas(self, source_directory, files, path_prefix="", file_list_required=True):
         """ Import only the files and symlinks listed in 'files' from source_directory to this one.


[buildstream] 13/43: Correct deleting and overwriting cases


github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 56a25c1e7ca30aa1d60ea8d3ca75def218a67057
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 16:16:18 2018 +0100

    Correct deleting and overwriting cases
---
 buildstream/storage/_casbaseddirectory.py | 187 +++++++++++++++++++++---------
 1 file changed, 134 insertions(+), 53 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index f624e34..f7ef35b 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -41,6 +41,8 @@ from ..utils import FileListResult, safe_copy, list_relative_paths
 from ..utils import FileListResult, safe_copy, list_relative_paths, _relative_symlink_target
 from .._artifactcache.cascache import CASCache
 
+import copy # Temporary
+import operator
 
 class IndexEntry():
     """ Used in our index of names to objects to store the 'modified' flag
@@ -85,7 +87,9 @@ class CasBasedDirectory(Directory):
         if ref:
             with open(self.cas_cache.objpath(ref), 'rb') as f:
                 self.pb2_directory.ParseFromString(f.read())
-
+                print("Opening ref {} and parsed into directory containing: {} {} {}.".format(ref.hash, [d.name for d in self.pb2_directory.directories],
+                                                                                        [d.name for d in self.pb2_directory.symlinks],
+                                                                                        [d.name for d in self.pb2_directory.files]))
         self.ref = ref
         self.index = OrderedDict()
         self.parent = parent
@@ -224,11 +228,27 @@ class CasBasedDirectory(Directory):
         symlinknode.target = os.readlink(os.path.join(basename, filename))
         self.index[filename] = IndexEntry(symlinknode, modified=(existing_link is not None))
 
+    def _add_new_link_direct(self, name, target):
+        existing_link = self._find_pb2_entry(name)
+        if existing_link:
+            symlinknode = existing_link
+        else:
+            symlinknode = self.pb2_directory.symlinks.add()
+        assert(isinstance(symlinknode, remote_execution_pb2.SymlinkNode))
+        symlinknode.name = name
+        # A symlink node has no digest.
+        symlinknode.target = target
+        self.index[name] = IndexEntry(symlinknode, modified=(existing_link is not None))
+
+        
     def delete_entry(self, name):
         for collection in [self.pb2_directory.files, self.pb2_directory.symlinks, self.pb2_directory.directories]:
-            if name in collection:
-                collection.remove(name)
+            for thing in collection:
+                if thing.name == name:
+                    print("Removing {} from PB2".format(name))
+                    collection.remove(thing)
         if name in self.index:
+            print("Removing {} from index".format(name))
             del self.index[name]
 
     def descend(self, subdirectory_spec, create=False):
@@ -432,17 +452,21 @@ class CasBasedDirectory(Directory):
             return True
         if (isinstance(existing_entry,
                        (remote_execution_pb2.FileNode, remote_execution_pb2.SymlinkNode))):
+            self.delete_entry(name)
+            print("Processing overwrite of file/symlink {}: Added to overwritten list and deleted".format(name))
             fileListResult.overwritten.append(relative_pathname)
             return True
         elif isinstance(existing_entry, remote_execution_pb2.DirectoryNode):
             # If 'name' maps to a DirectoryNode, then there must be an entry in index
             # pointing to another Directory.
             if self.index[name].buildstream_object.is_empty():
+                print("Processing overwrite of directory: Removing original")
                 self.delete_entry(name)
                 fileListResult.overwritten.append(relative_pathname)
                 return True
             else:
                 # We can't overwrite a non-empty directory, so we just ignore it.
+                print("Processing overwrite of non-empty directory: Ignoring overwrite")
                 fileListResult.ignored.append(relative_pathname)
                 return False
         assert False, ("Entry '{}' is not a recognised file/link/directory and not None; it is {}"
@@ -466,6 +490,9 @@ class CasBasedDirectory(Directory):
         """ Imports files from a traditional directory """
         result = FileListResult()
         for entry in sorted(files):
+            print("Importing {} from file system".format(entry))
+            print("...Order of elements was {}".format(", ".join(self.index.keys())))
+
             split_path = entry.split(os.path.sep)
             # The actual file on the FS we're importing
             import_file = os.path.join(source_directory, entry)
@@ -490,6 +517,8 @@ class CasBasedDirectory(Directory):
                 if self._check_replacement(entry, path_prefix, result):
                     self._add_new_file(source_directory, entry)
                     result.files_written.append(relative_pathname)
+            print("...Order of elements is now {}".format(", ".join(self.index.keys())))
+
         return result
 
 
@@ -546,6 +575,17 @@ class CasBasedDirectory(Directory):
         x = self._resolve_symlink(symlink_node)
         return isinstance(x, CasBasedDirectory)
 
+    def _verify_unique(self):
+        # Verifies that there are no duplicate names in this directory or subdirectories.
+        names = []
+        for entrylist in [self.pb2_directory.files, self.pb2_directory.directories, self.pb2_directory.symlinks]:
+            for e in entrylist:
+                if e.name in names:
+                    raise VirtualDirectoryError("Duplicate entry for name {} found".format(e.name))
+                names.append(e.name)
+        for d in self.pb2_directory.directories:
+            self.index[d.name].buildstream_object._verify_unique()
+    
     def _partial_import_cas_into_cas(self, source_directory, files, path_prefix="", file_list_required=True):
         """ Import only the files and symlinks listed in 'files' from source_directory to this one.
         Args:
@@ -554,7 +594,7 @@ class CasBasedDirectory(Directory):
            path_prefix (str): Prefix used to add entries to the file list result.
            file_list_required: Whether to update the file list while processing.
         """
-        print("Beginning partial import of {} into {}".format(source_directory, self))
+        print("Beginning partial import of {} into {}. Files are: >{}<".format(source_directory, self, ", ".join(files)))
         result = FileListResult()
         processed_directories = set()
         for f in files:
@@ -582,17 +622,24 @@ class CasBasedDirectory(Directory):
                 self.create_directory(f)
             else:
                 # We're importing a file or symlink - replace anything with the same name.
-                self._check_replacement(f, path_prefix, result)
-                item = source_directory.index[f].pb_object
-                if isinstance(item, remote_execution_pb2.FileNode):
-                    filenode = self.pb2_directory.files.add(digest=item.digest, name=f,
-                                                            is_executable=item.is_executable)
-                    self.index[f] = IndexEntry(filenode, modified=(fullname in result.overwritten))
-                else:
-                    assert(isinstance(item, remote_execution_pb2.SymlinkNode))
-                    symlinknode = self.pb2_directory.symlinks.add(name=f, target=item.target)
-                    # A symlink node has no digest.
-                    self.index[f] = IndexEntry(symlinknode, modified=(fullname in result.overwritten))
+                print("Import of file/symlink {} into this directory. Removing anything existing...".format(f))
+                print("   ... ordering of nodes in this dir was: {}".format(self.index.keys()))
+                print("   ... symlinks were {}".format([x.name for x in self.pb2_directory.symlinks]))
+                importable = self._check_replacement(f, path_prefix, result)
+                if importable:
+                    print("   ... after replacement of '{}', symlinks are now {}".format(f, [x.name for x in self.pb2_directory.symlinks]))
+                    item = source_directory.index[f].pb_object
+                    if isinstance(item, remote_execution_pb2.FileNode):
+                        print("   ... importing file")
+                        filenode = self.pb2_directory.files.add(digest=item.digest, name=f,
+                                                                is_executable=item.is_executable)
+                        self.index[f] = IndexEntry(filenode, modified=(fullname in result.overwritten))
+                    else:
+                        print("   ... importing symlink")
+                        assert(isinstance(item, remote_execution_pb2.SymlinkNode))
+                        self._add_new_link_direct(name=f, target=item.target)
+                        print("   ... symlinks are now {}".format([x.name for x in self.pb2_directory.symlinks]))
+                    print("   ... ordering of nodes in this dir is now: {}".format(self.index.keys()))
         return result
 
     def transfer_node_contents(destination, source):
@@ -638,47 +685,57 @@ class CasBasedDirectory(Directory):
         """
         if files is None:
             #return self._full_import_cas_into_cas(source_directory, can_hardlink=True)
-            files = source_directory.list_relative_paths()
+            files = list(source_directory.list_relative_paths())
             print("Extracted all files from source directory '{}': {}".format(source_directory, files))
-        return self._partial_import_cas_into_cas(source_directory, files)
+        return self._partial_import_cas_into_cas(source_directory, list(files))
 
     def showdiff(self, other):
         print("Diffing {} and {}:".format(self, other))
-        l1 = list(self.index.items())
-        l2 = list(other.index.items())
-        for (key, value) in l1:
-            if len(l2) == 0:
-                print("'Other' is short: no item to correspond to '{}' in first.".format(key))
-                return
-            (key2, value2) = l2.pop(0)
-            if key != key2:
-                print("Mismatch: item named {} in first, named {} in second".format(key, key2))
-                return
-            if type(value.pb_object) != type(value2.pb_object):
-                print("Mismatch: item named {}'s pb_object is a {} in first and a {} in second".format(key, type(value.pb_object), type(value2.pb_object)))
-                return
-            if type(value.buildstream_object) != type(value2.buildstream_object):
-                print("Mismatch: item named {}'s buildstream_object is a {} in first and a {} in second".format(key, type(value.buildstream_object), type(value2.buildstream_object)))
-                return
-            print("Inspecting {} of type {}".format(key, type(value.pb_object)))
-            if type(value.pb_object) == remote_execution_pb2.DirectoryNode:
-                # It's a directory, follow it
-                self.descend(key).showdiff(other.descend(key))
-            elif type(value.pb_object) == remote_execution_pb2.SymlinkNode:
-                target1 = value.pb_object.target
-                target2 = value2.pb_object.target
-                if target1 != target2:
-                    print("Symlink named {}: targets do not match. {} in the first, {} in the second".format(key, target1, target2))
-            elif type(value.pb_object) == remote_execution_pb2.FileNode:
-                if value.pb_object.digest != value2.pb_object.digest:
-                    print("File named {}: digests do not match. {} in the first, {} in the second".format(key, value.pb_object.digest, value2.pb_object.digest))
-        if len(l2) != 0:
-            print("'Other' is long: it contains extra items called: {}".format(", ".join([i[0] for i in l2])))
-            return
+
+        def compare_list(l1, l2):
+            item2 = None
+            index = 0
+            print("Comparing lists: {} vs {}".format([d.name for d in l1], [d.name for d in l2]))
+            for item1 in l1:
+                if index>=len(l2):
+                    print("l2 is short: no item to correspond to '{}' in l1.".format(item1.name))
+                    return False
+                item2 = l2[index]
+                if item1.name != item2.name:
+                    print("Items do not match: {} in l1, {} in l2".format(item1.name, item2.name))
+                    return False
+                index += 1
+            if index != len(l2):
+                print("l2 is long: Has extra items {}".format(l2[index:]))
+                return False
+            return True
+
+        def compare_pb2_directories(d1, d2):
+            result = (compare_list(d1.directories, d2.directories)
+                    and compare_list(d1.symlinks, d2.symlinks)
+                    and compare_list(d1.files, d2.files))
+            return result
+                        
+        if not compare_pb2_directories(self.pb2_directory, other.pb2_directory):
+            return False
+
+        for d in self.pb2_directory.directories:
+            self.index[d.name].buildstream_object.showdiff(other.index[d.name].buildstream_object)
         print("No differences found in {}".format(self))
               
+    def show_files_recursive(self):
+        elems = []
+        for (k,v) in self.index.items():
+            if type(v.pb_object) == remote_execution_pb2.DirectoryNode:
+                elems.append("{}=[{}]".format(k, v.buildstream_object.show_files_recursive()))
+            elif type(v.pb_object) == remote_execution_pb2.SymlinkNode:
+                elems.append("{}(s)".format(k))
+            elif type(v.pb_object) == remote_execution_pb2.FileNode:
+                elems.append("{}(f)".format(k))
+            else:
+                elems.append("{}(?)".format(k))
+        return " ".join(elems)
         
-    
     def import_files(self, external_pathspec, *, files=None,
                      report_written=True, update_utimes=False,
                      can_link=False):
@@ -701,12 +758,30 @@ class CasBasedDirectory(Directory):
         can_link (bool): Ignored, since hard links do not have any meaning within CAS.
         """
 
+        print("Directory before import: {}".format(self.show_files_recursive()))
+
+        # Sync self
+        self._recalculate_recursing_down()
+        if self.parent:
+            self.parent._recalculate_recursing_up(self)
+        
+        # Duplicate the current directory
+
+        
+        print("Original CAS before CAS-based import: {}".format(self.show_files_recursive()))
+        print("Original CAS hash: {}".format(self.ref.hash))
         duplicate_cas = None
+        self._verify_unique()
         if isinstance(external_pathspec, CasBasedDirectory):
+            duplicate_cas = CasBasedDirectory(self.context, ref=copy.copy(self.ref))
+            duplicate_cas._verify_unique()
+            print("-"*80 + "Performing direct CAS-to-CAS import")
+            print("Duplicated CAS before file-based import: {}".format(duplicate_cas.show_files_recursive()))
+            print("Duplicate CAS hash: {}".format(duplicate_cas.ref.hash))
             result = self._import_cas_into_cas(external_pathspec, files=files)
-
-            # Duplicate the current directory and do an import that way.
-            duplicate_cas = CasBasedDirectory(self.context, ref=self.ref)
+            self._verify_unique()
+            print("Result of cas-to-cas import: {}".format(self.show_files_recursive()))
+            print("-"*80 + "Performing round-trip import via file system")
             with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
                 external_pathspec.export_files(tmpdir)
                 if files is None:
@@ -714,8 +789,12 @@ class CasBasedDirectory(Directory):
                 duplicate_cas._import_files_from_directory(tmpdir, files=files)
                 duplicate_cas._recalculate_recursing_down()
                 if duplicate_cas.parent:
-                    duplicate_cas.parent._recalculate_recursing_up(self)
+                    duplicate_cas.parent._recalculate_recursing_up(duplicate_cas)
+                print("Result of direct import: {}".format(duplicate_cas.show_files_recursive()))
+               
+
         else:
+            print("-"*80 + "Performing initial import")
             if isinstance(external_pathspec, FileBasedDirectory):
                 source_directory = external_pathspec.get_underlying_directory()
             else:
@@ -800,6 +879,7 @@ class CasBasedDirectory(Directory):
         for entry in self.pb2_directory.symlinks:
             src_name = os.path.join(to_directory, entry.name)
             target_name = entry.target
+            print("Exporting symlink named {}".format(src_name))
             try:
                 os.symlink(target_name, src_name)
             except FileExistsError as e:
@@ -900,6 +980,7 @@ class CasBasedDirectory(Directory):
         for (k, v) in sorted(directory_list):
             print("Yielding from subdirectory name {}".format(k))
             yield from v.buildstream_object.list_relative_paths(relpath=os.path.join(relpath, k))
+        print("List_relative_paths on {} complete".format(relpath))
 
     def recalculate_hash(self):
         """ Recalcuates the hash for this directory and store the results in

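The delete_entry change above replaces a membership test on the name with a scan that compares each node's name. A minimal sketch of why the old test could not match, using a hypothetical Node dataclass as a stand-in for the pb2 FileNode/SymlinkNode/DirectoryNode types (the real protobuf containers may behave differently in detail). Unlike the diff, the sketch collects matches before removing them, so the list is not mutated while it is being iterated:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a pb2 file/symlink/directory node."""
    name: str


def delete_by_membership(collection: List[Node], name: str) -> bool:
    """Old approach: tests a string against a list of node objects."""
    if name in collection:  # a str never compares equal to a Node
        collection.remove(name)
        return True
    return False


def delete_by_name(collection: List[Node], name: str) -> bool:
    """New approach: match on each node's name attribute."""
    matches = [node for node in collection if node.name == name]
    for node in matches:
        collection.remove(node)
    return bool(matches)


files = [Node("a.txt"), Node("b.txt")]
removed_old = delete_by_membership(files, "a.txt")
names_after_old = [n.name for n in files]
removed_new = delete_by_name(files, "a.txt")
names_after_new = [n.name for n in files]
```

In this sketch the membership version leaves the list untouched, which would be consistent with stale entries surviving an overwrite.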

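The _verify_unique helper added in this commit checks that no name appears twice across the files, directories and symlinks of a directory, then recurses into subdirectories. A rough standalone sketch of the same idea follows. The Dir class and DuplicateEntryError are hypothetical stand-ins for CasBasedDirectory and VirtualDirectoryError, and a set is used for the seen names where the diff uses a list:

```python
from dataclasses import dataclass, field
from typing import Dict, List


class DuplicateEntryError(Exception):
    """Hypothetical stand-in for VirtualDirectoryError."""


@dataclass
class Dir:
    """Hypothetical in-memory directory: names only, no content."""
    files: List[str] = field(default_factory=list)
    symlinks: List[str] = field(default_factory=list)
    subdirs: Dict[str, "Dir"] = field(default_factory=dict)


def verify_unique(directory: Dir) -> None:
    """Raise if any name is used twice in this directory or below it."""
    seen = set()
    for name in [*directory.files, *directory.subdirs, *directory.symlinks]:
        if name in seen:
            raise DuplicateEntryError(
                "Duplicate entry for name {} found".format(name))
        seen.add(name)
    for child in directory.subdirs.values():
        verify_unique(child)


clean = Dir(files=["a"], subdirs={"sub": Dir(files=["a"])})
clashing = Dir(files=["a"], subdirs={"sub": Dir(files=["x"], symlinks=["x"])})
```

The same name may appear in different directories (as "a" does in the clean tree); only a clash within one directory is an error.
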
[buildstream] 07/43: _casbaseddirectory: _list_relative_paths now passes all tests


github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit dde560d0e44edd884b8f8b562e8b1bc056d8a224
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 4 17:52:15 2018 +0100

    _casbaseddirectory: _list_relative_paths now passes all tests
---
 buildstream/storage/_casbaseddirectory.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index ef7bb68..22a6664 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -611,16 +611,18 @@ class CasBasedDirectory(Directory):
         symlink_list = list(filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.SymlinkNode), self.index.items()))
         file_list = list(filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.FileNode), self.index.items()))
         directory_list = list(filter(lambda i: isinstance(i[1].buildstream_object, CasBasedDirectory), self.index.items()))
+        symlinks_to_directories_list = []
         print("Running list_relative_paths on relpath {}. files={}, symlinks={}".format(relpath, [f[0] for f in file_list], [s[0] for s in symlink_list]))
 
         for (k, v) in sorted(symlink_list):
             target = self._resolve(k, absolute_symlinks_resolve=True)
             if isinstance(target, CasBasedDirectory):
-                print("Adding the resolved symlink {} which resolves to {} to our directory list".format(k, target))
-                directory_list.append((k,IndexEntry(k, buildstream_object=target)))
+                symlinks_to_directories_list.append(k)
             else:
                 # Broken symlinks are also considered files!
                 file_list.append((k,v))
+        for d in sorted(symlinks_to_directories_list):
+            yield os.path.join(relpath, d)
         if file_list == [] and relpath != "":
             print("Yielding empty directory name {}".format(relpath))
             yield relpath
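
With this change, a symlink that resolves to a directory is yielded as a path in its own right instead of being descended into. A minimal sketch of the resulting ordering, assuming a hypothetical nested-dict tree where a value is "file", "dirlink", "brokenlink" or another dict for a subdirectory. The step that yields the files themselves is not visible in this hunk and is assumed here:

```python
import os
from typing import Dict, Iterator


def list_relative_paths(tree: Dict[str, object], relpath: str = "") -> Iterator[str]:
    """Yield paths in the order: dir-symlinks, files, then subdirectories."""
    dirlinks = sorted(k for k, v in tree.items() if v == "dirlink")
    # Broken symlinks are treated as files, as in the diff.
    files = sorted(k for k, v in tree.items() if v in ("file", "brokenlink"))
    subdirs = sorted(k for k, v in tree.items() if isinstance(v, dict))

    for name in dirlinks:
        yield os.path.join(relpath, name)
    if not files and relpath != "":
        # A directory with no files is reported by its own name.
        yield relpath
    for name in files:
        yield os.path.join(relpath, name)
    for name in subdirs:
        yield from list_relative_paths(tree[name], os.path.join(relpath, name))


tree = {
    "bin": {"sh": "file"},
    "lib64": "dirlink",
    "README": "file",
    "empty": {},
    "dangling": "brokenlink",
}
paths = list(list_relative_paths(tree))
```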


[buildstream] 19/43: Don't forcbily create directories in _resolve in all cases


github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit c7467e742f882346111d6ba1c312a7b72035f916
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Wed Oct 24 14:41:21 2018 +0100

    Don't forcbily create directories in _resolve in all cases
---
 buildstream/storage/_casbaseddirectory.py | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index d414723..a7ace32 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -289,7 +289,7 @@ class CasBasedDirectory(Directory):
                 return entry.descend(subdirectory_spec[1:], create)
             else:
                 # May be a symlink
-                target = self._resolve(subdirectory_spec[0])
+                target = self._resolve(subdirectory_spec[0], force_create=create)
                 if isinstance(target, CasBasedDirectory):
                     return target
                 error = "Cannot descend into {}, which is a '{}' in the directory {}"
@@ -382,7 +382,7 @@ class CasBasedDirectory(Directory):
         return directory
 
     
-    def _resolve(self, name, absolute_symlinks_resolve=True):
+    def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False):
         """ Resolves any name to an object. If the name points to a symlink in
         this directory, it returns the thing it points to,
         recursively. Returns a CasBasedDirectory, FileNode or
@@ -442,14 +442,21 @@ class CasBasedDirectory(Directory):
                     else:
                         # This is a file or None (i.e. broken symlink)
                         print("  resolving {}: file/broken link".format(c))
-                        if components:
+                        if f is None and force_create:
+                            print("Creating target of broken link {}".format(c))
+                            return directory.descend(c, create=True)
+                        elif components:
                             # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
                             raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
                         else:
                             return f
                 else:
-                    print("  resolving {}: Broken symlink".format(c))
-                    return None
+                    print("  resolving {}: Non-existent file; must be from a broken symlink.".format(c))
+                    if force_create:
+                        print("Creating target of broken link {} (2)".format(c))
+                        return directory.descend(c, create=True)
+                    else:
+                        return None
 
         # Shouldn't get here.
         

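The force_create flag introduced here decides whether resolving a broken link leaves the tree alone and returns None, or creates the missing target. A simplified sketch of that contract on a hypothetical nested-dict tree (dict for a directory, any other value for a file). Unlike the code in the diff, this sketch keeps walking the remaining components after creating one, and it raises ValueError where the original raises VirtualDirectoryError:

```python
from typing import Optional


def resolve_target(root: dict, target: str,
                   force_create: bool = False) -> Optional[object]:
    """Walk 'target' from 'root'; optionally create missing directories."""
    directory = root
    components = [c for c in target.split("/") if c not in ("", ".")]
    while components:
        c = components.pop(0)
        entry = directory.get(c)
        if entry is None:
            if not force_create:
                return None  # broken link; the tree is left untouched
            entry = directory.setdefault(c, {})
        elif not isinstance(entry, dict):
            if components:
                # Path continues, but this component is a file.
                raise ValueError(
                    "Reached a file called {} while resolving".format(c))
            return entry
        directory = entry
    return directory


tree = {"usr": {}}
missing = resolve_target(tree, "usr/lib")
tree_after_lookup = {"usr": dict(tree["usr"])}
created = resolve_target(tree, "usr/lib", force_create=True)
```

Keeping the default at force_create=False means a plain lookup has no side effects, which matches the docstring wording adopted later in this series.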

[buildstream] 08/43: _casbaseddirectory: Ingore '.' when resolving


github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit f242af89126ede07b8dc028aea4eaf570f5a468e
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 5 13:40:57 2018 +0100

    _casbaseddirectory: Ingore '.' when resolving
---
 buildstream/storage/_casbaseddirectory.py | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 22a6664..286d672 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -281,7 +281,9 @@ class CasBasedDirectory(Directory):
         directory = root
         components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
         for c in components:
-            if c == "..":
+            if c == ".":
+                pass
+            elif c == "..":
                 directory = directory.parent
             else:
                 directory = directory.descend(c, create=True)
@@ -322,7 +324,9 @@ class CasBasedDirectory(Directory):
                 # We ran out of path elements and ended up in a directory
                 return directory
             c = components.pop(0)
-            if c == "..":
+            if c == ".":
+                pass
+            elif c == "..":
                 print("  resolving {}: up-dir".format(c))
                 # If directory.parent *is* None, this is an attempt to access
                 # '..' from the root, which is valid under POSIX; it just

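Without the '.' case, a target such as "./lib" would be treated like any other component, and the create=True path would make a directory literally named ".". A small sketch of the component walk with both special cases, using a hypothetical Dir class with a parent pointer. Treating '..' at the root as the root itself follows POSIX pathname resolution, and skipping empty components is an addition of this sketch:

```python
from typing import Dict, Optional


class Dir:
    """Hypothetical directory node with a parent pointer."""

    def __init__(self, name: str, parent: Optional["Dir"] = None):
        self.name = name
        self.parent = parent
        self.children: Dict[str, "Dir"] = {}

    def child(self, name: str) -> "Dir":
        """Return the named child, creating it if needed."""
        if name not in self.children:
            self.children[name] = Dir(name, parent=self)
        return self.children[name]


def walk(start: Dir, target: str) -> Dir:
    """Follow a '/'-separated target, honouring '.' and '..'."""
    directory = start
    for c in target.split("/"):
        if c in ("", "."):
            continue  # stay in the current directory
        elif c == "..":
            if directory.parent is not None:
                directory = directory.parent
        else:
            directory = directory.child(c)
    return directory


root = Dir("")
usr = root.child("usr")
lib = usr.child("lib")
same_lib = walk(usr, "./lib")
round_trip = walk(lib, "../lib/.")
above_root = walk(root, "..")
```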

[buildstream] 30/43: casbaseddirectory: Combine all the _resolve functions


github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 5d2a97b1bc346ea6a70c4f97edcd0535375cac3f
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 18:02:12 2018 +0100

    casbaseddirectory: Combine all the _resolve functions
---
 buildstream/storage/_casbaseddirectory.py | 58 ++++---------------------------
 1 file changed, 7 insertions(+), 51 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 64f867b..c5768a8 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -329,26 +329,7 @@ class CasBasedDirectory(Directory):
         as a directory as long as it's within this directory tree.
         """
 
-        if isinstance(self.index[name].buildstream_object, Directory):
-            return self.index[name].buildstream_object
-        # OK then, it's a symlink
-        symlink = self._find_pb2_entry(name)
-        assert isinstance(symlink, remote_execution_pb2.SymlinkNode)
-        absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
-        if absolute:
-            root = self.find_root()
-        else:
-            root = self
-        directory = root
-        components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
-        for c in components:
-            if c == ".":
-                pass
-            elif c == "..":
-                directory = directory.parent
-            else:
-                directory = directory.descend(c, create=True)
-        return directory
+        return self._resolve(name, force_create=True)
 
     def _is_followable(self, name):
         """ Returns true if this is a directory or symlink to a valid directory. """
@@ -363,35 +344,16 @@ class CasBasedDirectory(Directory):
     def _resolve_symlink(self, node, force_create=True):
         """Same as _resolve_symlink_or_directory but takes a SymlinkNode.
         """
-
-        # OK then, it's a symlink
-        symlink = node
-        absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
-        if absolute:
-            root = self.find_root()
-        else:
-            root = self
-        directory = root
-        components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
-        for c in components:
-            if c == ".":
-                pass
-            elif c == "..":
-                directory = directory.parent
-            else:
-                if c in directory.index or force_create:
-                    directory = directory.descend(c, create=True)
-                else:
-                    return None
-        return directory
-
+        return self._resolve(node.name, force_create=True)
     
     def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False, first_seen_object = None):
         """ Resolves any name to an object. If the name points to a symlink in
         this directory, it returns the thing it points to,
         recursively. Returns a CasBasedDirectory, FileNode or
-        None. Never creates a directory or otherwise alters the
-        directory.
+        None.
+
+        If force_create is on, will attempt to create directories to make symlinks and directories resolve.
+        If force_create is off, this will never alter this directory.
 
         """
         # First check if it's a normal object and return that
@@ -438,7 +400,6 @@ class CasBasedDirectory(Directory):
             if c == ".":
                 pass
             elif c == "..":
-                print("  resolving {}: up-dir".format(c))
                 # If directory.parent *is* None, this is an attempt to access
                 # '..' from the root, which is valid under POSIX; it just
                 # returns the root.                
@@ -450,15 +411,12 @@ class CasBasedDirectory(Directory):
                     # Ultimately f must now be a file or directory
                     if isinstance(f, CasBasedDirectory):
                         directory = f
-                        print("  resolving {}: dir".format(c))
 
                     else:
                         # This is a file or None (i.e. broken symlink)
-                        print("  resolving {}: file/broken link".format(c))
                         if f is None and force_create:
-                            print("Creating target of broken link {}".format(c))
                             directory = directory.descend(c, create=True)
-                        elif components:
+                        elif components and force_create:
                             # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
                             print("Trying to resolve {}, but found {} was a file.".format(symlink.target, c))
                             self.delete_entry(c)
@@ -467,9 +425,7 @@ class CasBasedDirectory(Directory):
                         else:
                             return f
                 else:
-                    print("  resolving {}: Non-existent file; must be from a broken symlink.".format(c))
                     if force_create:
-                        print("Creating target of broken link {} (2)".format(c))
                         directory = directory.descend(c, create=True)
                     else:
                         return None
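
Note on the diff above: the combined _resolve walks the components of a symlink target, skips '.', treats '..' from the root as the root itself, and creates missing directories only when force_create is set. The following is a minimal sketch of that walk on an in-memory tree. The Node class and resolve_path function are hypothetical stand-ins, not BuildStream code. The sketch is also simplified: it does not follow nested symlinks, and a file found in the middle of a path yields None instead of being replaced.

```python
class Node:
    """Hypothetical in-memory directory; stands in for CasBasedDirectory."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}  # name -> Node (directory) or str (file contents)

    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def resolve_path(start, target, force_create=False):
    """Walk 'target' from 'start'. Returns a Node, file contents (str) or None.

    With force_create=False the tree is never modified.
    """
    directory = start.root() if target.startswith("/") else start
    components = [c for c in target.split("/") if c]
    for i, c in enumerate(components):
        if c == ".":
            continue
        if c == "..":
            # '..' from the root is valid under POSIX and yields the root.
            if directory.parent is not None:
                directory = directory.parent
            continue
        entry = directory.children.get(c)
        if isinstance(entry, Node):
            directory = entry
        elif entry is None:
            if not force_create:
                return None
            # Missing component: create it only because the caller asked.
            entry = Node(parent=directory)
            directory.children[c] = entry
            directory = entry
        else:
            # A file is only a valid result as the last component.
            return entry if i == len(components) - 1 else None
    return directory
```

Keeping a single walker with a force_create switch means the read-only callers and the import path share one set of rules for '.', '..' and absolute targets, which appears to be the motivation for removing the duplicated loops in this commit.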


[buildstream] 18/43: CASBasedDirectory: Import '.'

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 44c5f2e6018fa284275cb6f91fba1b66479cbc37
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 17:58:07 2018 +0100

    CASBasedDirectory: Import '.'
---
 buildstream/storage/_casbaseddirectory.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 28598b5..d414723 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -618,7 +618,7 @@ class CasBasedDirectory(Directory):
         result = FileListResult()
         processed_directories = set()
         for f in files:
-            if f == ".": continue
+            #if f == ".": continue
             fullname = os.path.join(path_prefix, f)
             components = f.split(os.path.sep)
             if len(components)>1:


[buildstream] 17/43: CASBasedDirectory: Do not sort the input file list!

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 09616076d5f58f015291b7a806b7d4e0122b53db
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 17:57:54 2018 +0100

    CASBasedDirectory: Do not sort the input file list!
---
 buildstream/storage/_casbaseddirectory.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index fd7b28a..28598b5 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -509,7 +509,7 @@ class CasBasedDirectory(Directory):
     def _import_files_from_directory(self, source_directory, files, path_prefix=""):
         """ Imports files from a traditional directory """
         result = FileListResult()
-        for entry in sorted(files):
+        for entry in files:
             print("Importing {} from file system".format(entry))
             print("...Order of elements was {}".format(", ".join(self.index.keys())))
 


[buildstream] 16/43: Virtual directory test: Expand random testing to 6 roots

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit f0cc6d1378e68bc6335dfa7d89765a7817a832b7
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 17:57:37 2018 +0100

    Virtual directory test: Expand random testing to 6 roots
---
 tests/storage/virtual_directory_import.py | 97 ++++++++++++++++---------------
 1 file changed, 50 insertions(+), 47 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 47b4935..24ef2e3 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -50,31 +50,33 @@ def generate_import_roots(directory):
                 os.symlink(content, os.path.join(rootdir, path))
 
 
-def generate_random_root(directory):
+def generate_random_roots(directory):
     random.seed(RANDOM_SEED)
-    rootname = "root6"
-    rootdir = os.path.join(directory, "content", rootname)
-    things = []
-    locations = ['.']
-    for i in range(0, 100):
-        location = random.choice(locations)
-        thingname = "node{}".format(i)
-        thing = random.choice(['dir', 'link', 'file'])
-        target = os.path.join(rootdir, location, thingname)
-        if thing == 'dir':
-            os.makedirs(target)
-            locations.append(os.path.join(location, thingname))
-        elif thing == 'file':
-            with open(target, "wt") as f:
-                f.write("This is node {}\n".format(i))
-        elif thing == 'link':
-            # TODO: Make some relative symlinks
-            if random.randint(1, 3) == 1 or len(things) == 0:
-                os.symlink("/broken", target)
-            else:
-                os.symlink(random.choice(things), target)
-        things.append(os.path.join(location, thingname))
-        print("Generated {}/{} ".format(rootdir, things[-1]))
+    for rootno in range(6,13):
+        rootname = "root{}".format(rootno)
+        rootdir = os.path.join(directory, "content", rootname)
+        things = []
+        locations = ['.']
+        os.makedirs(rootdir)
+        for i in range(0, 100):
+            location = random.choice(locations)
+            thingname = "node{}".format(i)
+            thing = random.choice(['dir', 'link', 'file'])
+            target = os.path.join(rootdir, location, thingname)
+            if thing == 'dir':
+                os.makedirs(target)
+                locations.append(os.path.join(location, thingname))
+            elif thing == 'file':
+                with open(target, "wt") as f:
+                    f.write("This is node {}\n".format(i))
+            elif thing == 'link':
+                # TODO: Make some relative symlinks
+                if random.randint(1, 3) == 1 or len(things) == 0:
+                    os.symlink("/broken", target)
+                else:
+                    os.symlink(random.choice(things), target)
+            things.append(os.path.join(location, thingname))
+            print("Generated {}/{} ".format(rootdir, things[-1]))
 
 
 def file_contents(path):
@@ -141,39 +143,40 @@ def directory_not_empty(path):
     return os.listdir(path)
 
 
-@pytest.mark.parametrize("original,overlay", combinations([1, 2, 3, 4, 5]))
+@pytest.mark.parametrize("original,overlay", combinations([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]))
 def test_cas_import(cli, tmpdir, original, overlay):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
     # Create some fake content
     generate_import_roots(tmpdir)
-    generate_random_root(tmpdir)
+    generate_random_roots(tmpdir)
     d = create_new_casdir(original, fake_context, tmpdir)
     d2 = create_new_casdir(overlay, fake_context, tmpdir)
     print("Importing dir {} into {}".format(overlay, original))
     d.import_files(d2)
     d.export_files(os.path.join(tmpdir, "output"))
     
-    for item in root_filesets[overlay - 1]:
-        (path, typename, content) = item
-        realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
-        if typename == 'F':
-            if os.path.isdir(realpath) and directory_not_empty(realpath):
-                # The file should not have overwritten the directory in this case.
-                pass
-            else:
-                assert os.path.isfile(realpath), "{} did not exist in the combined virtual directory".format(path)
-                assert file_contents_are(realpath, content)
-        elif typename == 'S':
-            if os.path.isdir(realpath) and directory_not_empty(realpath):
-                # The symlink should not have overwritten the directory in this case.
-                pass
-            else:
-                assert os.path.islink(realpath)
-                assert os.readlink(realpath) == content
-        elif typename == 'D':
-            # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
-            assert os.path.isdir(realpath)
+    if overlay < 6:
+        for item in root_filesets[overlay - 1]:
+            (path, typename, content) = item
+            realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
+            if typename == 'F':
+                if os.path.isdir(realpath) and directory_not_empty(realpath):
+                    # The file should not have overwritten the directory in this case.
+                    pass
+                else:
+                    assert os.path.isfile(realpath), "{} did not exist in the combined virtual directory".format(path)
+                    assert file_contents_are(realpath, content)
+            elif typename == 'S':
+                if os.path.isdir(realpath) and directory_not_empty(realpath):
+                    # The symlink should not have overwritten the directory in this case.
+                    pass
+                else:
+                    assert os.path.islink(realpath)
+                    assert os.readlink(realpath) == content
+            elif typename == 'D':
+                # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
+                assert os.path.isdir(realpath)
 
     # Now do the same thing with filebaseddirectories and check the contents match
     d3 = create_new_casdir(original, fake_context, tmpdir)
@@ -188,7 +191,7 @@ def test_directory_listing(cli, tmpdir, root):
     fake_context.artifactdir = tmpdir
     # Create some fake content
     generate_import_roots(tmpdir)
-    generate_random_root(tmpdir)
+    generate_random_roots(tmpdir)
 
     d = create_new_filedir(root, tmpdir)
     filelist = list(d.list_relative_paths())
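
Note on the diff above: the random roots are only useful as test fixtures because the generator is seeded, so every run builds the same trees. Below is a minimal, self-contained sketch of the same idea. It differs from the commit in two ways that are assumptions of this sketch: it uses a private random.Random instance per root rather than the global random.seed call, and the RANDOM_SEED value and the count parameter are arbitrary.

```python
import os
import random
import tempfile

RANDOM_SEED = 69105  # arbitrary value chosen for this sketch


def generate_random_root(rootdir, seed, count=20):
    """Populate rootdir with a reproducible mix of dirs, files and symlinks."""
    rng = random.Random(seed)
    os.makedirs(rootdir)
    things = []        # every path created so far, relative to rootdir
    locations = ["."]  # real directories that new nodes may be placed in
    for i in range(count):
        location = rng.choice(locations)
        name = "node{}".format(i)
        kind = rng.choice(["dir", "link", "file"])
        target = os.path.join(rootdir, location, name)
        if kind == "dir":
            os.makedirs(target)
            locations.append(os.path.join(location, name))
        elif kind == "file":
            with open(target, "wt") as f:
                f.write("This is node {}\n".format(i))
        else:
            # Roughly one link in three is deliberately broken.
            if rng.randint(1, 3) == 1 or not things:
                os.symlink("/broken", target)
            else:
                os.symlink(rng.choice(things), target)
        things.append(os.path.join(location, name))
    return things
```

With a per-root generator, the content of one root does not depend on how many random draws earlier roots consumed. The commit instead seeds once and generates roots 6 to 12 in sequence, which is also deterministic as long as the loop itself does not change.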


[buildstream] 11/43: Add code necessary to do cas-to-cas import

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 9a24fd164edd58967de0a7c17b6379f2c592a817
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 19 17:43:08 2018 +0100

    Add code necessary to do cas-to-cas import
---
 buildstream/storage/_casbaseddirectory.py | 246 ++++++++++++++++++++++++++++--
 tests/storage/virtual_directory_import.py |   3 +-
 2 files changed, 235 insertions(+), 14 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 286d672..69a3608 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -137,6 +137,41 @@ class CasBasedDirectory(Directory):
         # We don't need to do anything more than that; files were already added ealier, and symlinks are
         # part of the directory structure.
 
+    def _add_new_blank_directory(self, name) -> Directory:
+        bst_dir = CasBasedDirectory(self.context, parent=self, filename=name)
+        new_pb2_dirnode = self.pb2_directory.directories.add()
+        new_pb2_dirnode.name = name
+        # Calculate the hash for an empty directory
+        if name in self.index:
+            raise VirtualDirectoryError("Creating directory {} would overwrite an existing item in {}"
+                                        .format(name, str(self)))
+        new_pb2_directory = remote_execution_pb2.Directory()
+        self.cas_cache.add_object(digest=new_pb2_dirnode.digest, buffer=new_pb2_directory.SerializeToString())
+        self.index[name] = IndexEntry(new_pb2_dirnode, buildstream_object=bst_dir)
+        return bst_dir
+
+    def create_directory(self, name: str) -> Directory:
+        """Creates a directory if it does not already exist. This does not
+        cause an error if something exists; it will remove files and
+        symlinks to files which have the same name in this
+        directory. Symlinks to directories with the name 'name' are
+        unaltered; it's assumed that the target of that symlink will
+        be used.
+
+        """
+        existing_item = self._find_pb2_entry(name)
+        if isinstance(existing_item, remote_execution_pb2.FileNode):
+            # Directory imported over file with same name
+            self.remove_item(name)
+        elif isinstance(existing_item, remote_execution_pb2.SymlinkNode):
+            # Directory imported over symlink with same source name
+            if self.symlink_target_is_directory(existing_item):
+                return self._resolve_symlink_or_directory(name) # That's fine; any files in the source directory should end up at the target of the symlink.
+            else:
+                self.remove_item(name) # Symlinks to files get replaced
+        return self.descend(name, create=True) # Creates the directory if it doesn't already exist.
+
+
     def _find_pb2_entry(self, name):
         if name in self.index:
             return self.index[name].pb_object
@@ -233,6 +268,7 @@ class CasBasedDirectory(Directory):
             if isinstance(entry, CasBasedDirectory):
                 return entry.descend(subdirectory_spec[1:], create)
             else:
+                # May be a symlink
                 error = "Cannot descend into {}, which is a '{}' in the directory {}"
                 raise VirtualDirectoryError(error.format(subdirectory_spec[0],
                                                          type(entry).__name__,
@@ -289,6 +325,29 @@ class CasBasedDirectory(Directory):
                 directory = directory.descend(c, create=True)
         return directory
 
+    def _resolve_symlink(self, node):
+        """Same as _resolve_symlink_or_directory but takes a SymlinkNode.
+        """
+
+        # OK then, it's a symlink
+        symlink = node
+        absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
+        if absolute:
+            root = self.find_root()
+        else:
+            root = self
+        directory = root
+        components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
+        for c in components:
+            if c == ".":
+                pass
+            elif c == "..":
+                directory = directory.parent
+            else:
+                directory = directory.descend(c, create=True)
+        return directory
+
+    
     def _resolve(self, name, absolute_symlinks_resolve=True):
         """ Resolves any name to an object. If the name points to a symlink in this 
         directory, it returns the thing it points to, recursively. Returns a CasBasedDirectory, FileNode or None. Never creates a directory or otherwise alters the directory. """
@@ -428,6 +487,157 @@ class CasBasedDirectory(Directory):
                     result.files_written.append(relative_pathname)
         return result
 
+
+    def _save(self, name):
+        """ Saves this directory into the content cache as a named ref. This function is not
+        currently in use, but may be useful later. """
+        self._recalculate_recursing_up()
+        self._recalculate_recursing_down()
+        (rel_refpath, refname) = os.path.split(name)
+        refdir = os.path.join(self.cas_directory, 'refs', 'heads', rel_refpath)
+        refname = os.path.join(refdir, refname)
+
+        if not os.path.exists(refdir):
+            os.makedirs(refdir)
+        with open(refname, "wb") as f:
+            f.write(self.ref.SerializeToString())
+
+    def find_updated_files(self, modified_directory, prefix=""):
+        """Find the list of written and overwritten files that would result
+        from importing 'modified_directory' into this one.  This does
+        not change either directory. The reason this exists is for
+        direct imports of cas directories into other ones, which can
+        be done by simply replacing a hash, but we still need the file
+        lists.
+
+        """
+        result = FileListResult()
+        for entry in modified_directory.pb2_directory.directories:
+            existing_dir = self._find_pb2_entry(entry.name)
+            if existing_dir:
+                updates_files = existing_dir.find_updated_files(modified_directory.descend(entry.name),
+                                                                os.path.join(prefix, entry.name))
+                result.combine(updated_files)
+            else:
+                for f in source_directory.descend(entry.name).list_relative_paths():
+                    result.files_written.append(os.path.join(prefix, f))
+                    # None of these can overwrite anything, since the original files don't exist
+        for entry in modified_directory.pb2_directory.files + modified_directory.pb2_directory.symlinks:
+            if self._find_pb2_entry(entry.name):
+                result.files_overwritten.apppend(os.path.join(prefix, entry.name))
+            result.file_written.apppend(os.path.join(prefix, entry.name))
+        return result
+
+    def files_in_subdir(sorted_files, dirname):
+        """Filters sorted_files and returns only the ones which have
+           'dirname' as a prefix, with that prefix removed.
+
+        """
+        if not dirname.endswith(os.path.sep):
+            dirname += os.path.sep
+        return [f[len(dirname):] for f in sorted_files if f.startswith(dirname)]
+
+    def symlink_target_is_directory(self, symlink_node):
+        x = self._resolve_symlink(symlink_node)
+        return isinstance(x, CasBasedDirectory)
+
+    def _partial_import_cas_into_cas(self, source_directory, files, path_prefix="", file_list_required=True):
+        """ Import only the files and symlinks listed in 'files' from source_directory to this one.
+        Args:
+           source_directory (:class:`.CasBasedDirectory`): The directory to import from
+           files ([str]): List of pathnames to import.
+           path_prefix (str): Prefix used to add entries to the file list result.
+           file_list_required: Whether to update the file list while processing.
+        """
+        print("Beginning partial import of {} into {}".format(source_directory, self))
+        result = FileListResult()
+        processed_directories = set()
+        for f in files:
+            if f == ".": continue
+            fullname = os.path.join(path_prefix, f)
+            components = f.split(os.path.sep)
+            if len(components)>1:
+                # We are importing a thing which is in a subdirectory. We may have already seen this dirname
+                # for a previous file.
+                dirname = components[0]
+                if dirname not in processed_directories:
+                    # Now strip off the first directory name and import files recursively.
+                    subcomponents = CasBasedDirectory.files_in_subdir(files, dirname)
+                    self.create_directory(dirname)
+                    print("Creating destination in {}: {}".format(self, dirname))
+                    dest_subdir = self._resolve_symlink_or_directory(dirname)
+                    src_subdir = source_directory.descend(dirname)
+                    import_result = dest_subdir._partial_import_cas_into_cas(src_subdir, subcomponents,
+                                                                             path_prefix=fullname, file_list_required=file_list_required)
+                    result.combine(import_result)
+                processed_directories.add(dirname)
+            elif isinstance(source_directory.index[f].buildstream_object, CasBasedDirectory):
+                # The thing in the input file list is a directory on its own. In which case, replace any existing file, or symlink to file
+                # with the new, blank directory - if it's neither of those things, or doesn't exist, then just create the dir.
+                self.create_directory(f)
+            else:
+                # We're importing a file or symlink - replace anything with the same name.
+                self._check_replacement(f, path_prefix, result)
+                item = source_directory.index[f].pb_object
+                if isinstance(item, remote_execution_pb2.FileNode):
+                    filenode = self.pb2_directory.files.add(digest=item.digest, name=f,
+                                                            is_executable=item.is_executable)
+                    self.index[f] = IndexEntry(filenode, modified=(fullname in result.overwritten))
+                else:
+                    assert(isinstance(item, remote_execution_pb2.SymlinkNode))
+                    symlinknode = self.pb2_directory.symlinks.add(name=f, target=item.target)
+                    # A symlink node has no digest.
+                    self.index[f] = IndexEntry(symlinknode, modified=(fullname in result.overwritten))
+        return result
+
+    def transfer_node_contents(destination, source):
+        """Transfers all fields from the source PB2 node into the
+        destination. Destination and source must be of the same type and must
+        be a FileNode, SymlinkNode or DirectoryNode.
+        """
+        assert(type(destination) == type(source))
+        destination.name = source.name
+        if isinstance(destination, remote_execution_pb2.FileNode):
+            destination.digest.hash = source.digest.hash
+            destination.digest.size_bytes = source.digest.size_bytes
+            destination.is_executable = source.is_executable
+        elif isinstance(destination, remote_execution_pb2.SymlinkNode):
+            destination.target = source.target
+        elif isinstance(destination, remote_execution_pb2.DirectoryNode):
+            destination.digest.hash = source.digest.hash
+            destination.digest.size_bytes = source.digest.size_bytes
+        else:
+            raise VirtualDirectoryError("Incompatible type '{}' used as destination for transfer_node_contents"
+                                        .format(destination.type))
+
+    def _add_directory_from_node(self, source_node, source_casdir, can_hardlink=False):
+        # Duplicate the given node and add it to our index with a CasBasedDirectory object.
+        # No existing entry with the source node's name can exist.
+        # source_casdir is only needed if can_hardlink is True.
+        assert(self._find_pb2_entry(source_node.name) is None)
+
+        if can_hardlink:
+            new_dir_node = self.pb2_directory.directories.add()
+            CasBasedDirectory.transfer_node_contents(new_dir_node, source_node)
+            self.index[source_node.name] = IndexEntry(source_node, buildstream_object=source_casdir, modified=True)
+        else:
+            new_dir_node = self.pb2_directory.directories.add()
+            CasBasedDirectory.transfer_node_contents(new_dir_node, source_node)
+            buildStreamDirectory = CasBasedDirectory(self.context, ref=source_node.digest,
+                                                     parent=self, filename=source_node.name)
+            self.index[source_node.name] = IndexEntry(source_node, buildstream_object=buildStreamDirectory, modified=True)
+
+    def _import_cas_into_cas(self, source_directory, files=None):
+        """ A full import is significantly quicker than a partial import, because we can just
+        replace one directory with another's hash, without doing any recursion.
+        """
+        if files is None:
+            #return self._full_import_cas_into_cas(source_directory, can_hardlink=True)
+            files = source_directory.list_relative_paths()
+            print("Extracted all files from source directory '{}': {}".format(source_directory, files))
+        return self._partial_import_cas_into_cas(source_directory, files)
+
+
     def import_files(self, external_pathspec, *, files=None,
                      report_written=True, update_utimes=False,
                      can_link=False):
@@ -449,28 +659,34 @@ class CasBasedDirectory(Directory):
 
         can_link (bool): Ignored, since hard links do not have any meaning within CAS.
         """
-        if isinstance(external_pathspec, FileBasedDirectory):
-            source_directory = external_pathspec._get_underlying_directory()
-        elif isinstance(external_pathspec, CasBasedDirectory):
-            # TODO: This transfers from one CAS to another via the
-            # filesystem, which is very inefficient. Alter this so it
-            # transfers refs across directly.
+
+        duplicate_cas = None
+        if isinstance(external_pathspec, CasBasedDirectory):
+            result = self._import_cas_into_cas(external_pathspec, files=files)
+
+            # Duplicate the current directory and do an import that way.
+            duplicate_cas = CasBasedDirectory(self.context, ref=self.ref)
             with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
                 external_pathspec.export_files(tmpdir)
                 if files is None:
                     files = list_relative_paths(tmpdir)
-                result = self._import_files_from_directory(tmpdir, files=files)
-            return result
+                duplicate_cas._import_files_from_directory(tmpdir, files=files)
+                duplicate_cas._recalculate_recursing_down()
+                if duplicate_cas.parent:
+                    duplicate_cas.parent._recalculate_recursing_up(self)
         else:
-            source_directory = external_pathspec
-
-        if files is None:
-            files = list_relative_paths(source_directory)
+            if isinstance(external_pathspec, FileBasedDirectory):
+                source_directory = external_pathspec.get_underlying_directory()
+            else:
+                source_directory = external_pathspec
+            if files is None:
+                files = list_relative_paths(external_pathspec)
+            result = self._import_files_from_directory(source_directory, files=files)
 
         # TODO: No notice is taken of report_written, update_utimes or can_link.
         # Current behaviour is to fully populate the report, which is inefficient,
         # but still correct.
-        result = self._import_files_from_directory(source_directory, files=files)
+
 
         # We need to recalculate and store the hashes of all directories both
         # up and down the tree; we have changed our directory by importing files
@@ -480,6 +696,10 @@ class CasBasedDirectory(Directory):
         self._recalculate_recursing_down()
         if self.parent:
             self.parent._recalculate_recursing_up(self)
+        if duplicate_cas:
+            if duplicate_cas.ref.hash != self.ref.hash:
+                raise VirtualDirectoryError("Mismatch between file-imported result {} and cas-to-cas imported result {}.".format(duplicate_cas.ref.hash,self.ref.hash))
+
         return result
 
     def set_deterministic_mtime(self):
diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 1c78c1b..47b4935 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -150,9 +150,10 @@ def test_cas_import(cli, tmpdir, original, overlay):
     generate_random_root(tmpdir)
     d = create_new_casdir(original, fake_context, tmpdir)
     d2 = create_new_casdir(overlay, fake_context, tmpdir)
+    print("Importing dir {} into {}".format(overlay, original))
     d.import_files(d2)
     d.export_files(os.path.join(tmpdir, "output"))
-
+    
     for item in root_filesets[overlay - 1]:
         (path, typename, content) = item
         realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
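
Note on the diff above: _partial_import_cas_into_cas takes a flat list of relative paths, handles direct entries in place, and recurses once per top-level subdirectory with that directory's prefix stripped by files_in_subdir. The sketch below isolates just that path bookkeeping as plain functions. files_in_subdir follows the helper in the commit; group_by_top_level is a hypothetical name introduced here for illustration and does not exist in BuildStream.

```python
import os


def files_in_subdir(files, dirname):
    """Return the entries of 'files' under 'dirname', with that prefix removed."""
    # Appending the separator stops 'd' from matching a sibling such as 'dd'.
    if not dirname.endswith(os.path.sep):
        dirname += os.path.sep
    return [f[len(dirname):] for f in files if f.startswith(dirname)]


def group_by_top_level(files):
    """Split a flat path list into direct entries and per-directory sublists.

    Input order is preserved, and each subdirectory is collected only once,
    mirroring the processed_directories set in the commit.
    """
    direct = []
    subdirs = {}
    for f in files:
        components = f.split(os.path.sep)
        if len(components) > 1:
            dirname = components[0]
            if dirname not in subdirs:
                subdirs[dirname] = files_in_subdir(files, dirname)
        else:
            direct.append(f)
    return direct, subdirs
```

Each sublist can then be handed to the same routine for the matching subdirectory, which is how the recursion in the commit proceeds one path component at a time.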


[buildstream] 36/43: Remove create_directory

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit c74845da7bcac6a4ee100094b1a713551b508cb9
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 10:25:51 2018 +0000

    Remove create_directory
---
 buildstream/storage/_casbaseddirectory.py | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 092cc52..75184d2 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -141,27 +141,6 @@ class CasBasedDirectory(Directory):
         # We don't need to do anything more than that; files were already added ealier, and symlinks are
         # part of the directory structure.
 
-    def create_directory(self, name: str) -> Directory:
-        """Creates a directory if it does not already exist. This does not
-        cause an error if something exists; it will remove files and
-        symlinks to files which have the same name in this
-        directory. Symlinks to directories with the name 'name' are
-        unaltered; it's assumed that the target of that symlink will
-        be used.
-
-        """
-        existing_item = self._find_pb2_entry(name)
-        if isinstance(existing_item, remote_execution_pb2.FileNode):
-            # Directory imported over file with same name
-            self.delete_entry(name)
-        elif isinstance(existing_item, remote_execution_pb2.SymlinkNode):
-            # Directory imported over symlink with same source name
-            if self._symlink_target_is_directory(existing_item):
-                return self._force_resolve(name) # That's fine; any files in the source directory should end up at the target of the symlink.
-            else:
-                self.delete_entry(name) # Symlinks to files get replaced
-        return self.descend(name, create=True) # Creates the directory if it doesn't already exist.
-
 
     def _find_pb2_entry(self, name):
         if name in self.index:
@@ -563,7 +542,7 @@ class CasBasedDirectory(Directory):
                         # There's either a symlink (valid or not) or existing directory with this name, so do nothing.
                         pass
                 else:
-                    self.create_directory(f)                    
+                    self.descend(f, create=True)
             else:
                 # We're importing a file or symlink - replace anything with the same name.
                 print("Import of file/symlink {} into this directory. Removing anything existing...".format(f))


[buildstream] 02/43: tests/storage/virtual_directory_import.py: New file

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 6aea589c3076ee217a8c3e348a8ae0722452b7f7
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Sep 20 18:32:40 2018 +0100

    tests/storage/virtual_directory_import.py: New file
---
 tests/storage/virtual_directory_import.py | 137 ++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
new file mode 100644
index 0000000..b0dee41
--- /dev/null
+++ b/tests/storage/virtual_directory_import.py
@@ -0,0 +1,137 @@
+import os
+import pytest
+from tests.testutils import cli
+
+from buildstream.storage import CasBasedDirectory
+
+
+class FakeContext():
+    def __init__(self):
+        self.config_cache_quota = "65536"
+
+    def get_projects(self):
+        return []
+
+# This is a set of example file system contents. The test attempts to import
+# each on top of each other to test importing works consistently.
+# Each tuple is defined as (<filename>, <type>, <content>). Type can be
+# 'F' (file), 'S' (symlink) or 'D' (directory) with content being the contents
+# for a file or the destination for a symlink.
+root_filesets = [
+    [('a/b/c/textfile1', 'F', 'This is textfile 1\n')],
+    [('a/b/c/textfile1', 'F', 'This is the replacement textfile 1\n')],
+    [('a/b/d', 'D', '')],
+    [('a/b/c', 'S', '/a/b/d')],
+    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')],
+]
+
+empty_hash_ref = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
+
+
+def generate_import_roots(directory):
+    for fileset in range(1, len(root_filesets) + 1):
+        rootname = "root{}".format(fileset)
+        rootdir = os.path.join(directory, "content", rootname)
+
+        for (path, typesymbol, content) in root_filesets[fileset - 1]:
+            if typesymbol == 'F':
+                (dirnames, filename) = os.path.split(path)
+                os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
+                with open(os.path.join(rootdir, dirnames, filename), "wt") as f:
+                    f.write(content)
+            elif typesymbol == 'D':
+                os.makedirs(os.path.join(rootdir, path), exist_ok=True)
+            elif typesymbol == 'S':
+                (dirnames, filename) = os.path.split(path)
+                os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
+                os.symlink(content, os.path.join(rootdir, path))
+
+
+def file_contents(path):
+    with open(path, "r") as f:
+        result = f.read()
+    return result
+
+
+def file_contents_are(path, contents):
+    return file_contents(path) == contents
+
+
+def create_new_vdir(root_number, fake_context, tmpdir):
+    d = CasBasedDirectory(fake_context)
+    d.import_files(os.path.join(tmpdir, "content", "root{}".format(root_number)))
+    assert d.ref.hash != empty_hash_ref
+    return d
+
+
+def combinations(integer_range):
+    for x in integer_range:
+        for y in integer_range:
+            yield (x, y)
+
+
+def resolve_symlinks(path, root):
+    """ A function to resolve symlinks inside 'path' components apart from the last one.
+        For example, resolve_symlinks('/a/b/c/d', '/a/b')
+        will return '/a/b/f/d' if /a/b/c is a symlink to /a/b/f. The final component of
+        'path' is not resolved, because we typically want to inspect the symlink found
+        at that path, not its target.
+
+    """
+    components = path.split(os.path.sep)
+    location = root
+    for i in range(0, len(components) - 1):
+        location = os.path.join(location, components[i])
+        if os.path.islink(location):
+            # Resolve the link, add on all the remaining components
+            target = os.path.join(os.readlink(location))
+            tail = os.path.sep.join(components[i + 1:])
+
+            if target.startswith(os.path.sep):
+                # Absolute link - relative to root
+                location = os.path.join(root, target, tail)
+            else:
+                # Relative link - relative to symlink location
+                location = os.path.join(location, target)
+            return resolve_symlinks(location, root)
+    # If we got here, no symlinks were found. Add on the final component and return.
+    location = os.path.join(location, components[-1])
+    return location
+
+
+def directory_not_empty(path):
+    return os.listdir(path)
+
+
+@pytest.mark.parametrize("original,overlay", combinations([1, 2, 3, 4, 5]))
+def test_cas_import(cli, tmpdir, original, overlay):
+    fake_context = FakeContext()
+    fake_context.artifactdir = tmpdir
+    # Create some fake content
+    generate_import_roots(tmpdir)
+
+    d = create_new_vdir(original, fake_context, tmpdir)
+    d2 = create_new_vdir(overlay, fake_context, tmpdir)
+    d.import_files(d2)
+    d.export_files(os.path.join(tmpdir, "output"))
+
+    for item in root_filesets[overlay - 1]:
+        (path, typename, content) = item
+        realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
+        if typename == 'F':
+            if os.path.isdir(realpath) and directory_not_empty(realpath):
+                # The file should not have overwritten the directory in this case.
+                pass
+            else:
+                assert os.path.isfile(realpath), "{} did not exist in the combined virtual directory".format(path)
+                assert file_contents_are(realpath, content)
+        elif typename == 'S':
+            if os.path.isdir(realpath) and directory_not_empty(realpath):
+                # The symlink should not have overwritten the directory in this case.
+                pass
+            else:
+                assert os.path.islink(realpath)
+                assert os.readlink(realpath) == content
+        elif typename == 'D':
+            # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
+            assert os.path.isdir(realpath)
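
Note on resolve_symlinks in the file above: os.path.join discards every earlier component when a later one is absolute, so os.path.join(root, target, tail) with an absolute target does not stay under root, and the relative branch does not append the remaining components. The following is a minimal sketch of a variant that handles both cases. The name resolve_symlinks_sketch is hypothetical, and the rule that absolute link targets are interpreted relative to root is an assumption taken from the comment in the original helper. It does not guard against symlink cycles.

```python
import os


def resolve_symlinks_sketch(path, root):
    """Resolve symlinks in every component of 'path' except the last.

    'path' is relative to 'root'. Absolute link targets are treated as
    relative to 'root'; relative targets are relative to the link's directory.
    """
    components = [c for c in path.split(os.path.sep) if c not in ("", ".")]
    location = root
    for i, component in enumerate(components[:-1]):
        candidate = os.path.join(location, component)
        if os.path.islink(candidate):
            target = os.readlink(candidate)
            tail = components[i + 1:]
            if os.path.isabs(target):
                # Strip the leading separator so the result stays under 'root'.
                new_path = os.path.join(target.lstrip(os.path.sep), *tail)
            else:
                # Express the link's directory relative to 'root', then apply the target.
                base = os.path.relpath(location, root)
                new_path = os.path.normpath(os.path.join(base, target, *tail))
            # Restart from the root, since the new path may contain further links.
            return resolve_symlinks_sketch(new_path, root)
        location = candidate
    # No symlink found in the leading components: append the final one unresolved.
    return os.path.join(location, components[-1])
```

As in the original, the final component is left unresolved so that a test can inspect the symlink itself rather than its target.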


[buildstream] 04/43: Tests: add test_directory_listing

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 1700322c8fefd5f08f9d9e20d775a6621099585c
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Sep 28 16:33:42 2018 +0100

    Tests: add test_directory_listing
---
 tests/storage/virtual_directory_import.py | 34 ++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index b0dee41..b76fef7 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -3,6 +3,7 @@ import pytest
 from tests.testutils import cli
 
 from buildstream.storage import CasBasedDirectory
+from buildstream.storage import FileBasedDirectory
 
 
 class FakeContext():
@@ -57,12 +58,19 @@ def file_contents_are(path, contents):
     return file_contents(path) == contents
 
 
-def create_new_vdir(root_number, fake_context, tmpdir):
+def create_new_casdir(root_number, fake_context, tmpdir):
     d = CasBasedDirectory(fake_context)
     d.import_files(os.path.join(tmpdir, "content", "root{}".format(root_number)))
     assert d.ref.hash != empty_hash_ref
     return d
 
+def create_new_filedir(root_number, tmpdir):
+    root = os.path.join(tmpdir, "vdir")
+    os.makedirs(root)
+    d = FileBasedDirectory(root)
+    d.import_files(os.path.join(tmpdir, "content", "root{}".format(root_number)))
+    return d
+
 
 def combinations(integer_range):
     for x in integer_range:
@@ -110,8 +118,8 @@ def test_cas_import(cli, tmpdir, original, overlay):
     # Create some fake content
     generate_import_roots(tmpdir)
 
-    d = create_new_vdir(original, fake_context, tmpdir)
-    d2 = create_new_vdir(overlay, fake_context, tmpdir)
+    d = create_new_casdir(original, fake_context, tmpdir)
+    d2 = create_new_casdir(overlay, fake_context, tmpdir)
     d.import_files(d2)
     d.export_files(os.path.join(tmpdir, "output"))
 
@@ -135,3 +143,23 @@ def test_cas_import(cli, tmpdir, original, overlay):
         elif typename == 'D':
             # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
             assert os.path.isdir(realpath)
+
+
+@pytest.mark.parametrize("root", [1, 2, 3, 4, 5])
+def test_directory_listing(cli, tmpdir, root):
+    fake_context = FakeContext()
+    fake_context.artifactdir = tmpdir
+    # Create some fake content
+    generate_import_roots(tmpdir)
+
+    d = create_new_filedir(root, tmpdir)
+    filelist = list(d.list_relative_paths())
+
+    d2 = create_new_casdir(root, fake_context, tmpdir)
+    filelist2 = list(d2.list_relative_paths())
+
+    print("filelist for root {} via FileBasedDirectory:".format(root))
+    print("{}".format(filelist))
+    print("filelist for root {} via CasBasedDirectory:".format(root))
+    print("{}".format(filelist2))
+    assert(filelist==filelist2)


[buildstream] 21/43: Separation of fixed/random tests in virtual_directory_import

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 4a892faf65e295d8976eca9ebe64995819c425f4
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Wed Oct 24 19:01:36 2018 +0100

    Separation of fixed/random tests in virtual_directory_import
---
 tests/storage/virtual_directory_import.py | 131 ++++++++++++++++--------------
 1 file changed, 72 insertions(+), 59 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 4754800..dfe3580 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -31,56 +31,54 @@ empty_hash_ref = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b8
 RANDOM_SEED = 69105
 
 
-def generate_import_roots(directory):
-    for fileset in range(1, len(root_filesets) + 1):
-        rootname = "root{}".format(fileset)
-        rootdir = os.path.join(directory, "content", rootname)
-
-        for (path, typesymbol, content) in root_filesets[fileset - 1]:
-            if typesymbol == 'F':
-                (dirnames, filename) = os.path.split(path)
-                os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
-                with open(os.path.join(rootdir, dirnames, filename), "wt") as f:
-                    f.write(content)
-            elif typesymbol == 'D':
-                os.makedirs(os.path.join(rootdir, path), exist_ok=True)
-            elif typesymbol == 'S':
-                (dirnames, filename) = os.path.split(path)
-                os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
-                os.symlink(content, os.path.join(rootdir, path))
-
-
-def generate_random_roots(directory):
-    random.seed(RANDOM_SEED)
-    for rootno in range(6,21):
-        rootname = "root{}".format(rootno)
-        rootdir = os.path.join(directory, "content", rootname)
-        things = []
-        locations = ['.']
-        os.makedirs(rootdir)
-        for i in range(0, 100):
-            location = random.choice(locations)
-            thingname = "node{}".format(i)
-            thing = random.choice(['dir', 'link', 'file'])
-            target = os.path.join(rootdir, location, thingname)
-            description = thing
-            if thing == 'dir':
-                os.makedirs(target)
-                locations.append(os.path.join(location, thingname))
-            elif thing == 'file':
-                with open(target, "wt") as f:
-                    f.write("This is node {}\n".format(i))
-            elif thing == 'link':
-                # TODO: Make some relative symlinks
-                if random.randint(1, 3) == 1 or len(things) == 0:
-                    os.symlink("/broken", target)
-                    description = "symlink pointing to /broken"
-                else:
-                    symlink_destination = random.choice(things)
-                    os.symlink(symlink_destination, target)
-                    description = "symlink pointing to {}".format(symlink_destination)
-            things.append(os.path.join(location, thingname))
-            print("Generated {}/{}, a {}".format(rootdir, things[-1], description))
+def generate_import_roots(rootno, directory):
+    rootname = "root{}".format(rootno)
+    rootdir = os.path.join(directory, "content", rootname)
+
+    for (path, typesymbol, content) in root_filesets[rootno - 1]:
+        if typesymbol == 'F':
+            (dirnames, filename) = os.path.split(path)
+            os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
+            with open(os.path.join(rootdir, dirnames, filename), "wt") as f:
+                f.write(content)
+        elif typesymbol == 'D':
+            os.makedirs(os.path.join(rootdir, path), exist_ok=True)
+        elif typesymbol == 'S':
+            (dirnames, filename) = os.path.split(path)
+            os.makedirs(os.path.join(rootdir, dirnames), exist_ok=True)
+            os.symlink(content, os.path.join(rootdir, path))
+
+
+def generate_random_root(rootno, directory):
+    random.seed(RANDOM_SEED+rootno)
+    rootname = "root{}".format(rootno)
+    rootdir = os.path.join(directory, "content", rootname)
+    things = []
+    locations = ['.']
+    os.makedirs(rootdir)
+    for i in range(0, 100):
+        location = random.choice(locations)
+        thingname = "node{}".format(i)
+        thing = random.choice(['dir', 'link', 'file'])
+        target = os.path.join(rootdir, location, thingname)
+        description = thing
+        if thing == 'dir':
+            os.makedirs(target)
+            locations.append(os.path.join(location, thingname))
+        elif thing == 'file':
+            with open(target, "wt") as f:
+                f.write("This is node {}\n".format(i))
+        elif thing == 'link':
+            # TODO: Make some relative symlinks
+            if random.randint(1, 3) == 1 or len(things) == 0:
+                os.symlink("/broken", target)
+                description = "symlink pointing to /broken"
+            else:
+                symlink_destination = random.choice(things)
+                os.symlink(symlink_destination, target)
+                description = "symlink pointing to {}".format(symlink_destination)
+        things.append(os.path.join(location, thingname))
+        print("Generated {}/{}, a {}".format(rootdir, things[-1], description))
 
 
 def file_contents(path):
@@ -147,20 +145,21 @@ def directory_not_empty(path):
     return os.listdir(path)
 
 
-@pytest.mark.parametrize("original,overlay", combinations(range(1,21)))
-def test_cas_import(cli, tmpdir, original, overlay):
+def _import_test(tmpdir, original, overlay, generator_function, verify_contents=False):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
     # Create some fake content
-    generate_import_roots(tmpdir)
-    generate_random_roots(tmpdir)
+    generator_function(original, tmpdir)
+    if original != overlay:
+        generator_function(overlay, tmpdir)
+        
     d = create_new_casdir(original, fake_context, tmpdir)
     d2 = create_new_casdir(overlay, fake_context, tmpdir)
     print("Importing dir {} into {}".format(overlay, original))
     d.import_files(d2)
     d.export_files(os.path.join(tmpdir, "output"))
     
-    if overlay < 6:
+    if verify_contents:
         for item in root_filesets[overlay - 1]:
             (path, typename, content) = item
             realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
@@ -188,14 +187,19 @@ def test_cas_import(cli, tmpdir, original, overlay):
     d3.import_files(d2)
     assert d.ref.hash == d3.ref.hash
 
+@pytest.mark.parametrize("original,overlay", combinations(range(1,6)))
+def test_fixed_cas_import(cli, tmpdir, original, overlay):
+    _import_test(tmpdir, original, overlay, generate_import_roots, verify_contents=True)
+
+@pytest.mark.parametrize("original,overlay", combinations(range(1,11)))
+def test_random_cas_import(cli, tmpdir, original, overlay):
+    _import_test(tmpdir, original, overlay, generate_random_root, verify_contents=False)
 
-@pytest.mark.parametrize("root", [1, 2, 3, 4, 5, 6])
-def test_directory_listing(cli, tmpdir, root):
+def _listing_test(tmpdir, root, generator_function):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
     # Create some fake content
-    generate_import_roots(tmpdir)
-    generate_random_roots(tmpdir)
+    generator_function(root, tmpdir)
 
     d = create_new_filedir(root, tmpdir)
     filelist = list(d.list_relative_paths())
@@ -208,3 +212,12 @@ def test_directory_listing(cli, tmpdir, root):
     print("filelist for root {} via CasBasedDirectory:".format(root))
     print("{}".format(filelist2))
     assert filelist == filelist2
+    
+
+@pytest.mark.parametrize("root", range(1,11))
+def test_random_directory_listing(cli, tmpdir, root):
+    _listing_test(tmpdir, root, generate_random_root)
+    
+@pytest.mark.parametrize("root", [1, 2, 3, 4, 5])
+def test_fixed_directory_listing(cli, tmpdir, root):
+    _listing_test(tmpdir, root, generate_import_roots)
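
Note on the per-root seeding above: seeding with RANDOM_SEED + rootno makes the content of each random root depend only on its number, which is what lets a test generate just the roots it needs. The sketch below illustrates the same idea with a private random.Random instance instead of the global generator. That substitution is an assumption made for illustration and is not what the commit does, and plan_random_root is a hypothetical helper that only plans entries and creates nothing on disk.

```python
import os
import random

RANDOM_SEED = 69105  # same constant as the test module


def plan_random_root(rootno, count=100):
    """Return the (relative path, kind) entries a random root would contain.

    Hypothetical helper: it only plans the tree and touches no files.
    """
    rng = random.Random(RANDOM_SEED + rootno)  # private generator per root
    locations = ['.']
    plan = []
    for i in range(count):
        location = rng.choice(locations)
        name = os.path.join(location, "node{}".format(i))
        kind = rng.choice(['dir', 'link', 'file'])
        if kind == 'dir':
            # Later entries may be placed inside this directory.
            locations.append(name)
        plan.append((name, kind))
    return plan
```

Calling plan_random_root(7) before or after plan_random_root(6) gives the same plan for each root, so parametrized tests can run in any order.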


[buildstream] 03/43: CasBasedDirectory: New way of doing list_relative_paths - not working yet

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit d8cc12d7e2236efc268f5e49695ac7b0cc0c6c31
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Sep 28 16:33:15 2018 +0100

    CasBasedDirectory: New way of doing list_relative_paths - not working yet
---
 buildstream/storage/_casbaseddirectory.py | 34 +++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 07fd206..e5c83c5 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -526,7 +526,13 @@ class CasBasedDirectory(Directory):
                 filelist.append(k)
         return filelist
 
-    def list_relative_paths(self):
+    def _contains_only_directories(self):
+        for (k, v) in self.index.items():
+            if not isinstance(v.buildstream_object, CasBasedDirectory):
+                return False
+        return True
+
+    def list_relative_paths(self, relpath=""):
         """Provide a list of all relative paths.
 
         NOTE: This list is not in the same order as utils.list_relative_paths.
@@ -534,13 +540,25 @@ class CasBasedDirectory(Directory):
         Return value: List(str) - list of all paths
         """
 
-        filelist = []
-        for (k, v) in self.index.items():
-            if isinstance(v.buildstream_object, CasBasedDirectory):
-                filelist.extend([k + os.path.sep + x for x in v.buildstream_object.list_relative_paths()])
-            elif isinstance(v.pb_object, remote_execution_pb2.FileNode):
-                filelist.append(k)
-        return filelist
+        print("Running list_relative_paths on relpath {}".format(relpath))
+        symlink_list = filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.SymlinkNode), self.index.items())
+        file_list = filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.FileNode), self.index.items())
+        print("Running list_relative_paths on relpath {}. files={}, symlinks={}".format(relpath, [f[0] for f in file_list], [s[0] for s in symlink_list]))
+
+        for (k, v) in sorted(symlink_list):
+            print("Yielding symlink {}".format(k))
+            yield os.path.join(relpath, k)
+        for (k, v) in sorted(file_list):
+            print("Yielding file {}".format(k))
+            yield os.path.join(relpath, k)
+        else:
+            print("Yielding empty directory name {}".format(relpath))
+            yield relpath
+
+        directory_list = filter(lambda i: isinstance(i[1].buildstream_object, CasBasedDirectory), self.index.items())
+        for (k, v) in sorted(directory_list):
+            print("Yielding from subdirectory name {}".format(k))
+            yield from v.buildstream_object.list_relative_paths(relpath=os.path.join(relpath, k))
 
     def recalculate_hash(self):
         """ Recalculates the hash for this directory and stores the results in
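
Note on the list_relative_paths change above, which the commit itself labels "not working yet": two Python behaviours probably contribute. First, filter() returns a one-shot iterator, so the debug print that builds lists from file_list and symlink_list leaves nothing for the later sorted() calls. Second, an else clause on a for loop runs whenever the loop finishes without break, so relpath is yielded for every directory, not only empty ones. The following is a minimal sketch of the apparent intent. The nested dict standing in for the CAS index is hypothetical, and yielding a directory's own name only when it is empty is an assumption based on the "Yielding empty directory name" message.

```python
import os


def list_relative_paths_sketch(tree, relpath=""):
    """Yield symlinks, then files, then the contents of subdirectories, each sorted.

    'tree' is a hypothetical stand-in for the CAS index: a dict mapping a
    name to 'S' (symlink), 'F' (file) or a nested dict (directory).
    """
    # Materialise sorted lists so they can be inspected and iterated repeatedly.
    symlinks = sorted(k for k, v in tree.items() if v == 'S')
    files = sorted(k for k, v in tree.items() if v == 'F')
    directories = sorted(k for k, v in tree.items() if isinstance(v, dict))

    for name in symlinks + files:
        yield os.path.join(relpath, name)

    # Report the directory itself only when it is empty and is not the root.
    if not tree and relpath:
        yield relpath

    for name in directories:
        yield from list_relative_paths_sketch(tree[name], os.path.join(relpath, name))
```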


[buildstream] 29/43: _casbaseddirectory.py: Remove some unnecessary things, label others

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 8205b88cd229b0969e4743fc3831bf6314463f42
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 17:49:45 2018 +0100

    _casbaseddirectory.py: Remove some unnecessary things, label others
---
 buildstream/storage/_casbaseddirectory.py | 33 ++++---------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index e30a23d..64f867b 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -614,17 +614,6 @@ class CasBasedDirectory(Directory):
         x = self._resolve_symlink(symlink_node, force_create=False)
         return isinstance(x, CasBasedDirectory)
 
-    def _verify_unique(self):
-        # Verifies that there are no duplicate names in this directory or subdirectories.
-        names = []
-        for entrylist in [self.pb2_directory.files, self.pb2_directory.directories, self.pb2_directory.symlinks]:
-            for e in entrylist:
-                if e.name in names:
-                    raise VirtualDirectoryError("Duplicate entry for name {} found".format(e.name))
-                names.append(e.name)
-        for d in self.pb2_directory.directories:
-            self.index[d.name].buildstream_object._verify_unique()
-    
     def _partial_import_cas_into_cas(self, source_directory, files, path_prefix="", file_list_required=True):
         """ Import only the files and symlinks listed in 'files' from source_directory to this one.
         Args:
@@ -633,11 +622,9 @@ class CasBasedDirectory(Directory):
            path_prefix (str): Prefix used to add entries to the file list result.
            file_list_required: Whether to update the file list while processing.
         """
-        print("Beginning partial import of {} into {}. Files are: >{}<".format(source_directory, self, ", ".join(files)))
         result = FileListResult()
         processed_directories = set()
         for f in files:
-            #if f == ".": continue
             fullname = os.path.join(path_prefix, f)
             components = f.split(os.path.sep)
             if len(components)>1:
@@ -657,12 +644,9 @@ class CasBasedDirectory(Directory):
                         else:
                             dest_subdir = x
                     else:
-                        print("Importing {}: {} does not exist in {}, so it is created as a directory".format(f, dirname, self))
-                        
                         self.create_directory(dirname)
                         dest_subdir = self._resolve_symlink_or_directory(dirname)
                     src_subdir = source_directory.descend(dirname)
-                    print("Now recursing into {} to continue adding {}".format(src_subdir, f))
                     import_result = dest_subdir._partial_import_cas_into_cas(src_subdir, subcomponents,
                                                                              path_prefix=fullname, file_list_required=file_list_required)
                     result.combine(import_result)
@@ -741,12 +725,12 @@ class CasBasedDirectory(Directory):
         replace one directory with another's hash, without doing any recursion.
         """
         if files is None:
-            #return self._full_import_cas_into_cas(source_directory, can_hardlink=True)
-            files = list(source_directory.list_relative_paths())
-            print("Extracted all files from source directory '{}': {}".format(source_directory, files))
+            files = source_directory.list_relative_paths()
+        # You must pass a list into _partial_import (not a generator)
         return self._partial_import_cas_into_cas(source_directory, list(files))
 
     def _describe(self, thing):
+        """ Only used by showdiff, and as such, not called """
         # Describes protocol buffer objects
         if isinstance(thing, remote_execution_pb2.DirectoryNode):
             return "directory called {}".format(thing.name)
@@ -757,10 +741,8 @@ class CasBasedDirectory(Directory):
         else:
             return "strange thing"
         
-    
     def showdiff(self, other):
-        print("Diffing {} and {}:".format(self, other))
-
+        """ An old function used to show differences between two directories. No longer in use. """
         def compare_list(l1, l2, name):
             item2 = None
             index = 0
@@ -835,16 +817,9 @@ class CasBasedDirectory(Directory):
 
         print("Directory before import: {}".format(self.show_files_recursive()))
 
-        # Sync self (necessary?)
-        self._recalculate_recursing_down()
-        if self.parent:
-            self.parent._recalculate_recursing_up(self)
-        
-        self._verify_unique()
         if isinstance(external_pathspec, CasBasedDirectory):
             print("-"*80 + "Performing direct CAS-to-CAS import")
             result = self._import_cas_into_cas(external_pathspec, files=files)
-            self._verify_unique()
             print("Result of cas-to-cas import: {}".format(self.show_files_recursive()))
         else:
             print("-"*80 + "Performing initial import")
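
Note on the removed _verify_unique: it checked that no name appears twice among the files, directories and symlinks of a directory, using a list for membership tests. Should such a check be wanted again as a debugging aid, it could be sketched as below. The function find_duplicate_names is hypothetical and takes plain lists of names, not protocol buffer entries.

```python
from collections import Counter


def find_duplicate_names(files, directories, symlinks):
    """Return the sorted names that occur more than once across the three lists."""
    counts = Counter(list(files) + list(directories) + list(symlinks))
    return sorted(name for name, n in counts.items() if n > 1)
```

A caller could raise an error when the returned list is non-empty, and recurse into subdirectories as the removed method did.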


[buildstream] 20/43: CAS-to-CAS: Now passing all 20x20 tests

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 7677f3fd60a887f30b44ccfd632d3bdef61a3cd3
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Wed Oct 24 18:28:39 2018 +0100

    CAS-to-CAS: Now passing all 20x20 tests
---
 buildstream/storage/_casbaseddirectory.py | 70 +++++++++++++++++++++++--------
 tests/storage/virtual_directory_import.py | 14 ++++---
 2 files changed, 61 insertions(+), 23 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index a7ace32..d00cdd6 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -299,6 +299,9 @@ class CasBasedDirectory(Directory):
         else:
             if create:
                 newdir = self._add_directory(subdirectory_spec[0])
+                print("Created new directory called {} and descending into it".format(subdirectory_spec[0]))
+                #if subdirectory_spec[0] == "broken":
+                #    assert False
                 return newdir.descend(subdirectory_spec[1:], create)
             else:
                 error = "No entry called '{}' found in {}. There are directories called {}."
@@ -359,7 +362,7 @@ class CasBasedDirectory(Directory):
         print("Is {} followable? Resolved to {}".format(name, target))
         return isinstance(target, CasBasedDirectory) or target is None
 
-    def _resolve_symlink(self, node):
+    def _resolve_symlink(self, node, force_create=True):
         """Same as _resolve_symlink_or_directory but takes a SymlinkNode.
         """
 
@@ -378,7 +381,10 @@ class CasBasedDirectory(Directory):
             elif c == "..":
                 directory = directory.parent
             else:
-                directory = directory.descend(c, create=True)
+                if c in directory.index or force_create:
+                    directory = directory.descend(c, create=True)
+                else:
+                    return None
         return directory
 
     
@@ -401,6 +407,7 @@ class CasBasedDirectory(Directory):
             return index_entry.pb_object
         
         assert isinstance(index_entry.pb_object, remote_execution_pb2.SymlinkNode)
+        print("Resolving '{}': This is a symlink node in the current directory.".format(name))
         symlink = index_entry.pb_object
         components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
 
@@ -444,7 +451,7 @@ class CasBasedDirectory(Directory):
                         print("  resolving {}: file/broken link".format(c))
                         if f is None and force_create:
                             print("Creating target of broken link {}".format(c))
-                            return directory.descend(c, create=True)
+                            directory = directory.descend(c, create=True)
                         elif components:
                             # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
                             raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
@@ -454,7 +461,7 @@ class CasBasedDirectory(Directory):
                     print("  resolving {}: Non-existent file; must be from a broken symlink.".format(c))
                     if force_create:
                         print("Creating target of broken link {} (2)".format(c))
-                        return directory.descend(c, create=True)
+                        directory = directory.descend(c, create=True)
                     else:
                         return None
 
@@ -529,6 +536,8 @@ class CasBasedDirectory(Directory):
                 directory_name = split_path[0]
                 # Hand this off to the importer for that subdir. This will only do one file -
                 # a better way would be to hand off all the files in this subdir at once.
+                # failed here because directory_name didn't point to a directory...
+                print("Attempting to import into {} from {}".format(directory_name, source_directory))
                 subdir_result = self._import_directory_recursively(directory_name, source_directory,
                                                                    split_path[1:], path_prefix)
                 result.combine(subdir_result)
@@ -599,7 +608,7 @@ class CasBasedDirectory(Directory):
         return [f[len(dirname):] for f in sorted_files if f.startswith(dirname)]
 
     def symlink_target_is_directory(self, symlink_node):
-        x = self._resolve_symlink(symlink_node)
+        x = self._resolve_symlink(symlink_node, force_create=False)
         return isinstance(x, CasBasedDirectory)
 
     def _verify_unique(self):
@@ -637,14 +646,20 @@ class CasBasedDirectory(Directory):
                     subcomponents = CasBasedDirectory.files_in_subdir(files, dirname)
                     # We will fail at this point if there is a file or symlink to file called 'dirname'.
                     if dirname in self.index:
-                        x = self._resolve(dirname)
+                        x = self._resolve(dirname, force_create=True)
                         if isinstance(x, remote_execution_pb2.FileNode):
                             self.delete_entry(dirname)
                             result.overwritten.append(f)
-                    self.create_directory(dirname)
-                    print("Creating destination in {}: {}".format(self, dirname))
-                    dest_subdir = self._resolve_symlink_or_directory(dirname)
+                            dest_subdir = self.descend(dirname, create=True)
+                        else:
+                            dest_subdir = x
+                    else:
+                        print("Importing {}: {} does not exist in {}, so it is created as a directory".format(f, dirname, self))
+                        
+                        self.create_directory(dirname)
+                        dest_subdir = self._resolve_symlink_or_directory(dirname)
                     src_subdir = source_directory.descend(dirname)
+                    print("Now recursing into {} to continue adding {}".format(src_subdir, f))
                     import_result = dest_subdir._partial_import_cas_into_cas(src_subdir, subcomponents,
                                                                              path_prefix=fullname, file_list_required=file_list_required)
                     result.combine(import_result)
@@ -652,7 +667,19 @@ class CasBasedDirectory(Directory):
             elif isinstance(source_directory.index[f].buildstream_object, CasBasedDirectory):
                 # The thing in the input file list is a directory on its own. In which case, replace any existing file, or symlink to file
                 # with the new, blank directory - if it's neither of those things, or doesn't exist, then just create the dir.
-                self.create_directory(f)
+                if f in self.index:
+                    x = self._resolve(f)
+                    if x is None:
+                        # If we're importing a blank directory, and the target has a broken symlink, then do nothing.
+                        pass
+                    elif isinstance(x, remote_execution_pb2.FileNode):
+                        # Files with the same name, or symlinks to files, get removed.
+                        pass
+                    else:
+                        # There's either a symlink (valid or not) or existing directory with this name, so do nothing.
+                        pass
+                else:
+                    self.create_directory(f)                    
             else:
                 # We're importing a file or symlink - replace anything with the same name.
                 print("Import of file/symlink {} into this directory. Removing anything existing...".format(f))
@@ -737,18 +764,22 @@ class CasBasedDirectory(Directory):
     def showdiff(self, other):
         print("Diffing {} and {}:".format(self, other))
 
-        def compare_list(l1, l2):
+        def compare_list(l1, l2, name):
             item2 = None
             index = 0
-            print("Comparing lists: {} vs {}".format([d.name for d in l1], [d.name for d in l2]))
+            print("Comparing {} lists: {} vs {}".format(name, [d.name for d in l1], [d.name for d in l2]))
             for item1 in l1:
                 if index>=len(l2):
                     print("l2 is short: no item to correspond to '{}' in l1.".format(item1.name))
                     return False
                 item2 = l2[index]
                 if item1.name != item2.name:
-                    print("Items do not match: {}, a {} in l1, vs {}, a {} in l2".format(item1.name, self._describe(item1), item2.name, self._describe(item2)))
+                    print("Items do not match in {} list: {}, a {} in l1, vs {}, a {} in l2".format(name, item1.name, self._describe(item1), item2.name, self._describe(item2)))
                     return False
+                if isinstance(item1, remote_execution_pb2.FileNode):
+                    if item1.is_executable != item2.is_executable:
+                        print("Executable flags do not match on file {}.".format(item1.name))
+                        return False
                 index += 1
             if index != len(l2):
                 print("l2 is long: Has extra items {}".format(l2[index:]))
@@ -756,17 +787,19 @@ class CasBasedDirectory(Directory):
             return True
 
         def compare_pb2_directories(d1, d2):
-            result = (compare_list(d1.directories, d2.directories)
-                    and compare_list(d1.symlinks, d2.symlinks)
-                    and compare_list(d1.files, d2.files))
+            result = (compare_list(d1.directories, d2.directories, "directory")
+                    and compare_list(d1.symlinks, d2.symlinks, "symlink")
+                    and compare_list(d1.files, d2.files, "file"))
             return result
                         
         if not compare_pb2_directories(self.pb2_directory, other.pb2_directory):
             return False
 
         for d in self.pb2_directory.directories:
-            self.index[d.name].buildstream_object.showdiff(other.index[d.name].buildstream_object)
+            if not self.index[d.name].buildstream_object.showdiff(other.index[d.name].buildstream_object):
+                return False
         print("No differences found in {}".format(self))
+        return True
               
     def show_files_recursive(self):
         elems = []
@@ -830,7 +863,8 @@ class CasBasedDirectory(Directory):
             with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
                 external_pathspec.export_files(tmpdir)
                 if files is None:
-                    files = list_relative_paths(tmpdir)
+                    files = list(list_relative_paths(tmpdir))
+                print("Importing from filesystem: filelist is: {}".format(files))
                 duplicate_cas._import_files_from_directory(tmpdir, files=files)
                 duplicate_cas._recalculate_recursing_down()
                 if duplicate_cas.parent:
diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 24ef2e3..4754800 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -24,7 +24,7 @@ root_filesets = [
     [('a/b/c/textfile1', 'F', 'This is the replacement textfile 1\n')],
     [('a/b/d', 'D', '')],
     [('a/b/c', 'S', '/a/b/d')],
-    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')],
+    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')]
 ]
 
 empty_hash_ref = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
@@ -52,7 +52,7 @@ def generate_import_roots(directory):
 
 def generate_random_roots(directory):
     random.seed(RANDOM_SEED)
-    for rootno in range(6,13):
+    for rootno in range(6,21):
         rootname = "root{}".format(rootno)
         rootdir = os.path.join(directory, "content", rootname)
         things = []
@@ -63,6 +63,7 @@ def generate_random_roots(directory):
             thingname = "node{}".format(i)
             thing = random.choice(['dir', 'link', 'file'])
             target = os.path.join(rootdir, location, thingname)
+            description = thing
             if thing == 'dir':
                 os.makedirs(target)
                 locations.append(os.path.join(location, thingname))
@@ -73,10 +74,13 @@ def generate_random_roots(directory):
                 # TODO: Make some relative symlinks
                 if random.randint(1, 3) == 1 or len(things) == 0:
                     os.symlink("/broken", target)
+                    description = "symlink pointing to /broken"
                 else:
-                    os.symlink(random.choice(things), target)
+                    symlink_destination = random.choice(things)
+                    os.symlink(symlink_destination, target)
+                    description = "symlink pointing to {}".format(symlink_destination)
             things.append(os.path.join(location, thingname))
-            print("Generated {}/{} ".format(rootdir, things[-1]))
+            print("Generated {}/{}, a {}".format(rootdir, things[-1], description))
 
 
 def file_contents(path):
@@ -143,7 +147,7 @@ def directory_not_empty(path):
     return os.listdir(path)
 
 
-@pytest.mark.parametrize("original,overlay", combinations([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]))
+@pytest.mark.parametrize("original,overlay", combinations(range(1,21)))
 def test_cas_import(cli, tmpdir, original, overlay):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
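
Note on the parametrization above: the `combinations` helper is not shown in this diff. The sketch below assumes it yields every ordered (original, overlay) pair of root numbers, which would match the "20x20 tests" wording in the branch summary. The function body is a hypothetical stand-in, not the test module's actual code.

```python
import itertools


def combinations(integer_range):
    # Hypothetical stand-in for the helper used by the test module:
    # every ordered pair (original, overlay), including a root paired
    # with itself.
    return list(itertools.product(integer_range, repeat=2))


pairs = combinations(range(1, 21))
print(len(pairs))
```

Under this assumption, widening the range from 12 roots to 20 grows the test matrix from 144 to 400 cases.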


[buildstream] 43/43: Restructure of .import()

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 68e105460471ad3136f9066480d8990125251b65
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 12:34:18 2018 +0000

    Restructure of .import()
---
 buildstream/storage/_casbaseddirectory.py | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 5fc4099..ad30307 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -559,8 +559,7 @@ class CasBasedDirectory(Directory):
         """ A full import is significantly quicker than a partial import, because we can just
         replace one directory with another's hash, without doing any recursion.
         """
-        if files is None:
-            files = source_directory.list_relative_paths()
+
         # You must pass a list into _partial_import (not a generator)
         return self._partial_import_cas_into_cas(source_directory, list(files))
 
@@ -650,27 +649,27 @@ class CasBasedDirectory(Directory):
         can_link (bool): Ignored, since hard links do not have any meaning within CAS.
         """
 
-        print("Directory before import: {}".format(self.show_files_recursive()))
-
-        if isinstance(external_pathspec, CasBasedDirectory):
-            print("-"*80 + "Performing direct CAS-to-CAS import")
-            result = self._import_cas_into_cas(external_pathspec, files=files)
-            print("Result of cas-to-cas import: {}".format(self.show_files_recursive()))
-        else:
-            print("-"*80 + "Performing initial import")
-            if isinstance(external_pathspec, FileBasedDirectory):
-                source_directory = external_pathspec.get_underlying_directory()
-            else:
-                source_directory = external_pathspec
-            if files is None:
+        if files is None:
+            if isinstance(external_pathspec, str):
                 files = list_relative_paths(external_pathspec)
+            else:
+                assert isinstance(external_pathspec, Directory)
+                files = external_pathspec.list_relative_paths()
+
+        if isinstance(external_pathspec, FileBasedDirectory):
+            source_directory = external_pathspec.get_underlying_directory()
+            result = self._import_files_from_directory(source_directory, files=files)
+        elif isinstance(external_pathspec, str):
+            source_directory = external_pathspec
             result = self._import_files_from_directory(source_directory, files=files)
+        else:
+            assert isinstance(external_pathspec, CasBasedDirectory)
+            result = self._import_cas_into_cas(external_pathspec, files=files)
 
         # TODO: No notice is taken of report_written, update_utimes or can_link.
         # Current behaviour is to fully populate the report, which is inefficient,
         # but still correct.
 
-
         # We need to recalculate and store the hashes of all directories both
         # up and down the tree; we have changed our directory by importing files
         # which changes our hash and all our parents' hashes of us. The trees
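
The restructured dispatch in this commit can be illustrated in isolation. The sketch below is a simplified model, not the BuildStream implementation: the three classes are minimal hypothetical stand-ins, and `choose_import_strategy` only reports which import path would be taken instead of performing the import.

```python
class Directory:
    """Hypothetical stand-in for the abstract directory base class."""


class FileBasedDirectory(Directory):
    def __init__(self, path, names):
        self.path = path
        self.names = list(names)

    def get_underlying_directory(self):
        return self.path

    def list_relative_paths(self):
        return iter(self.names)


class CasBasedDirectory(Directory):
    def __init__(self, names):
        self.names = list(names)

    def list_relative_paths(self):
        return iter(self.names)


def choose_import_strategy(external_pathspec, files=None):
    # Step 1: work out the file list once, whatever the source type.
    # (For a plain string path the real code walks the filesystem; this
    # sketch requires the caller to pass 'files' in that case.)
    if files is None:
        assert isinstance(external_pathspec, Directory)
        files = external_pathspec.list_relative_paths()
    files = list(files)  # generators are consumed exactly once here

    # Step 2: pick the import path based on the source type.
    if isinstance(external_pathspec, FileBasedDirectory):
        return ("filesystem", external_pathspec.get_underlying_directory(), files)
    elif isinstance(external_pathspec, str):
        return ("filesystem", external_pathspec, files)
    else:
        assert isinstance(external_pathspec, CasBasedDirectory)
        return ("cas", external_pathspec, files)
```

Resolving the file list before dispatching means both branches receive the same kind of input, which is the main effect of the restructure.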


[buildstream] 22/43: hack: remove files which previously blocked directory creation

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 14623feb9c455754853b39c218b19233a31dd57a
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 15:12:05 2018 +0100

    hack: remove files which previously blocked directory creation
---
 buildstream/storage/_casbaseddirectory.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index d00cdd6..3388cfc 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -454,7 +454,10 @@ class CasBasedDirectory(Directory):
                             directory = directory.descend(c, create=True)
                         elif components:
                             # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
-                            raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
+                            print("Trying to resolve {}, but found {} was a file.".format(symlink.target, c))
+                            self.delete_entry(c)
+                            directory = directory.descend(c, create=True)
+                            #raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
                         else:
                             return f
                 else:


[buildstream] 39/43: Rename _add_new_link and remove duplicated code

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 3c7049672c58d3b9e1221b13269271bb09bb3775
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 11:17:41 2018 +0000

    Rename _add_new_link and remove duplicated code
---
 buildstream/storage/_casbaseddirectory.py | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 40c506d..5b81698 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -180,16 +180,8 @@ class CasBasedDirectory(Directory):
         filenode.is_executable = is_executable
         self.index[filename] = IndexEntry(filenode, modified=(filename in self.index))
 
-    def _add_new_link(self, basename, filename):
-        existing_link = self._find_pb2_entry(filename)
-        if existing_link:
-            symlinknode = existing_link
-        else:
-            symlinknode = self.pb2_directory.symlinks.add()
-        symlinknode.name = filename
-        # A symlink node has no digest.
-        symlinknode.target = os.readlink(os.path.join(basename, filename))
-        self.index[filename] = IndexEntry(symlinknode, modified=(existing_link is not None))
+    def _copy_link_from_filesystem(self, basename, filename):
+        self._add_new_link_direct(filename, os.readlink(os.path.join(basename, filename)))
 
     def _add_new_link_direct(self, name, target):
         existing_link = self._find_pb2_entry(name)
@@ -462,7 +454,7 @@ class CasBasedDirectory(Directory):
                 result.combine(subdir_result)
             elif os.path.islink(import_file):
                 if self._check_replacement(entry, path_prefix, result):
-                    self._add_new_link(source_directory, entry)
+                    self._copy_link_from_filesystem(source_directory, entry)
                     result.files_written.append(relative_pathname)
             elif os.path.isdir(import_file):
                 # A plain directory which already exists isn't a problem; just ignore it.
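
The delegation introduced by this commit can be shown with a small standalone model. `LinkStore` is a hypothetical stand-in for the symlink bookkeeping of a CAS directory (a plain dict instead of protobuf nodes); only the shape of the two methods mirrors the diff. The sketch assumes a platform where `os.symlink` is permitted.

```python
import os
import tempfile


class LinkStore:
    """Hypothetical, dict-backed stand-in for a directory's symlink entries."""

    def __init__(self):
        self.links = {}

    def add_new_link_direct(self, name, target):
        # Single place where a link entry is created or overwritten.
        self.links[name] = target

    def copy_link_from_filesystem(self, basename, filename):
        # Read the link target from disk, then reuse the direct method.
        target = os.readlink(os.path.join(basename, filename))
        self.add_new_link_direct(filename, target)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        # A dangling link is fine: readlink never follows the target.
        os.symlink("/broken", os.path.join(tmpdir, "node1"))
        store = LinkStore()
        store.copy_link_from_filesystem(tmpdir, "node1")
        return store.links
```

Keeping the filesystem read in a thin wrapper leaves one code path that mutates link entries, which is what removes the duplicated code.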


[buildstream] 23/43: Detect infinite symlink loops in resolve()

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 5fc7d6b407edc3a4335144e4658f5e52a5ec13cd
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 16:48:07 2018 +0100

    Detect infinite symlink loops in resolve()
---
 buildstream/storage/_casbaseddirectory.py | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 3388cfc..ce0ec2b 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -388,7 +388,7 @@ class CasBasedDirectory(Directory):
         return directory
 
     
-    def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False):
+    def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False, first_seen_object = None):
         """ Resolves any name to an object. If the name points to a symlink in
         this directory, it returns the thing it points to,
         recursively. Returns a CasBasedDirectory, FileNode or
@@ -407,6 +407,14 @@ class CasBasedDirectory(Directory):
             return index_entry.pb_object
         
         assert isinstance(index_entry.pb_object, remote_execution_pb2.SymlinkNode)
+
+        if first_seen_object is None:
+            first_seen_object = index_entry.pb_object
+        else:
+            if index_entry.pb_object == first_seen_object:
+                ### Infinite symlink loop detected ###
+                return None
+        
         print("Resolving '{}': This is a symlink node in the current directory.".format(name))
         symlink = index_entry.pb_object
         components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
@@ -440,7 +448,7 @@ class CasBasedDirectory(Directory):
                     directory = directory.parent
             else:
                 if c in directory.index:
-                    f = directory._resolve(c, absolute_symlinks_resolve)
+                    f = directory._resolve(c, absolute_symlinks_resolve, first_seen_object=first_seen_object)
                     # Ultimately f must now be a file or directory
                     if isinstance(f, CasBasedDirectory):
                         directory = f
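
Comparing against only the first symlink seen detects a loop that returns to its starting point, but not one that is entered after following an earlier link. A common alternative is to track every symlink visited. The sketch below shows that alternative on a deliberately simplified model: a flat dict mapping names to `(kind, value)` tuples, which is an assumption made for illustration and not the CasBasedDirectory data structure.

```python
def resolve(index, name, seen=None):
    """Follow symlinks in a flat name -> (kind, value) mapping.

    Returns the final non-link entry, or None if the name is missing,
    the link is broken, or a loop is found.
    """
    if seen is None:
        seen = set()
    if name not in index:
        return None
    kind, value = index[name]
    if kind != "link":
        return (kind, value)
    if name in seen:
        # This link was already followed during this resolution: a loop.
        return None
    seen.add(name)
    return resolve(index, value, seen)


index = {
    "a": ("link", "b"),   # a -> b -> c -> b -> ... never returns to 'a'
    "b": ("link", "c"),
    "c": ("link", "b"),
    "f": ("file", "data"),
    "g": ("link", "f"),
    "h": ("link", "missing"),
}
```

Here resolving "a" enters the b/c cycle without ever revisiting "a", so a first-seen check alone would not terminate, while the visited set does.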


[buildstream] 26/43: CasBasedDirectory: Remove some prints

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit b89b7866d3bdd0a4120401dbafc191f7aebcf616
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 17:25:51 2018 +0100

    CasBasedDirectory: Remove some prints
---
 buildstream/storage/_casbaseddirectory.py | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index f1799b6..6a25602 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -245,10 +245,8 @@ class CasBasedDirectory(Directory):
         for collection in [self.pb2_directory.files, self.pb2_directory.symlinks, self.pb2_directory.directories]:
             for thing in collection:
                 if thing.name == name:
-                    print("Removing {} from PB2".format(name))
                     collection.remove(thing)
         if name in self.index:
-            print("Removing {} from index".format(name))
             del self.index[name]
 
     def descend(self, subdirectory_spec, create=False):
@@ -535,9 +533,6 @@ class CasBasedDirectory(Directory):
         """ Imports files from a traditional directory """
         result = FileListResult()
         for entry in files:
-            print("Importing {} from file system".format(entry))
-            print("...Order of elements was {}".format(", ".join(self.index.keys())))
-
             split_path = entry.split(os.path.sep)
             # The actual file on the FS we're importing
             import_file = os.path.join(source_directory, entry)
@@ -548,7 +543,6 @@ class CasBasedDirectory(Directory):
                 # Hand this off to the importer for that subdir. This will only do one file -
                 # a better way would be to hand off all the files in this subdir at once.
                 # failed here because directory_name didn't point to a directory...
-                print("Attempting to import into {} from {}".format(directory_name, source_directory))
                 subdir_result = self._import_directory_recursively(directory_name, source_directory,
                                                                    split_path[1:], path_prefix)
                 result.combine(subdir_result)
@@ -564,8 +558,6 @@ class CasBasedDirectory(Directory):
                 if self._check_replacement(entry, path_prefix, result):
                     self._add_new_file(source_directory, entry)
                     result.files_written.append(relative_pathname)
-            print("...Order of elements is now {}".format(", ".join(self.index.keys())))
-
         return result
 
 
@@ -694,23 +686,17 @@ class CasBasedDirectory(Directory):
             else:
                 # We're importing a file or symlink - replace anything with the same name.
                 print("Import of file/symlink {} into this directory. Removing anything existing...".format(f))
-                print("   ... ordering of nodes in this dir was: {}".format(self.index.keys()))
-                print("   ... symlinks were {}".format([x.name for x in self.pb2_directory.symlinks]))
                 importable = self._check_replacement(f, path_prefix, result)
                 if importable:
                     print("   ... after replacement of '{}', symlinks are now {}".format(f, [x.name for x in self.pb2_directory.symlinks]))
                     item = source_directory.index[f].pb_object
                     if isinstance(item, remote_execution_pb2.FileNode):
-                        print("   ... importing file")
                         filenode = self.pb2_directory.files.add(digest=item.digest, name=f,
                                                                 is_executable=item.is_executable)
                         self.index[f] = IndexEntry(filenode, modified=(fullname in result.overwritten))
                     else:
-                        print("   ... importing symlink")
                         assert(isinstance(item, remote_execution_pb2.SymlinkNode))
                         self._add_new_link_direct(name=f, target=item.target)
-                        print("   ... symlinks are now {}".format([x.name for x in self.pb2_directory.symlinks]))
-                    print("   ... ordering of nodes in this dir is now: {}".format(self.index.keys()))
         return result
 
     def transfer_node_contents(destination, source):


[buildstream] 40/43: Remove some prints and improve comments

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 545219dc9ea1212b2db53efdae6e76ac5d4fbfb8
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 11:48:44 2018 +0000

    Remove some prints and improve comments
---
 buildstream/storage/_casbaseddirectory.py | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 5b81698..a868c52 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -195,7 +195,6 @@ class CasBasedDirectory(Directory):
         symlinknode.target = target
         self.index[name] = IndexEntry(symlinknode, modified=(existing_link is not None))
 
-        
     def delete_entry(self, name):
         for collection in [self.pb2_directory.files, self.pb2_directory.symlinks, self.pb2_directory.directories]:
             for thing in collection:
@@ -252,9 +251,6 @@ class CasBasedDirectory(Directory):
         else:
             if create:
                 newdir = self._add_directory(subdirectory_spec[0])
-                print("Created new directory called {} and descending into it".format(subdirectory_spec[0]))
-                #if subdirectory_spec[0] == "broken":
-                #    assert False
                 return newdir.descend(subdirectory_spec[1:], create)
             else:
                 error = "No entry called '{}' found in {}. There are directories called {}."
@@ -293,23 +289,31 @@ class CasBasedDirectory(Directory):
         if isinstance(self.index[name].buildstream_object, Directory):
             return True
         target = self._resolve(name)
-        print("Is {} followable? Resolved to {}".format(name, target))
-        return isinstance(target, CasBasedDirectory) or target is None
+        return isinstance(target, CasBasedDirectory) or target is None  #  TODO: But why return True if it's None (broken link/circular loop)? Surely that is against the docstring.
 
     def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False, first_seen_object = None):
         """ Resolves any name to an object. If the name points to a symlink in
         this directory, it returns the thing it points to,
-        recursively. Returns a CasBasedDirectory, FileNode or
-        None.
+        recursively.
+
+        Returns a CasBasedDirectory, FileNode or
+        None. None indicates any of these cases:
+        * 'name' does not exist in this directory
+        * 'name' is a broken symlink,
+        * 'name' points to an infinite symlink loop.
+        * 'name' points to an absolute symlink and absolute_symlinks_resolve is False.
 
         If force_create is on, will attempt to create directories to make symlinks and directories resolve.
         If force_create is off, this will never alter this directory.
 
         """
-        # First check if it's a normal object and return that
 
+        # TODO: first_seen_object isn't sufficient. We could get into a loop after following one link and not detect it.
+        # TODO: 'None' is overloaded, maybe we should use exceptions and leave 'None' for actual nonexistent things.
         if name not in self.index:
             return None
+
+        # First check if it's a normal object and return that
         index_entry = self.index[name]
         if isinstance(index_entry.buildstream_object, Directory):
             return index_entry.buildstream_object
@@ -325,7 +329,6 @@ class CasBasedDirectory(Directory):
                 ### Infinite symlink loop detected ###
                 return None
         
-        print("Resolving '{}': This is a symlink node in the current directory.".format(name))
         symlink = index_entry.pb_object
         components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
 
@@ -336,12 +339,12 @@ class CasBasedDirectory(Directory):
                 # Discard the first empty element
                 components.pop(0)
             else:
-                print("  _resolve: Absolute symlink, which we won't resolve.")
+                # Unresolvable absolute symlink
                 return None
         else:
             start_directory = self
+
         directory = start_directory
-        print("Resolve {}: starting from {}".format(symlink.target, start_directory))
         while True:
             if not components:
                 # We ran out of path elements and ended up in a directory
@@ -357,8 +360,9 @@ class CasBasedDirectory(Directory):
                 # returns the root.                
             else:
                 if c in directory.index:
+                    # Recursive resolve and continue
                     f = directory._resolve(c, absolute_symlinks_resolve, first_seen_object=first_seen_object)
-                    # Ultimately f must now be a file or directory
+
                     if isinstance(f, CasBasedDirectory):
                         directory = f
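
The TODO in this commit notes that None is overloaded and suggests exceptions as a possible alternative. One way that could look is sketched below, again on a simplified flat dict model. The exception classes and `resolve_strict` are hypothetical names introduced for illustration; they are not part of BuildStream.

```python
class ResolutionError(Exception):
    """Hypothetical base class for the failures below."""


class BrokenSymlink(ResolutionError):
    pass


class SymlinkLoop(ResolutionError):
    pass


def resolve_strict(index, name, seen=None):
    """Like a None-returning resolve, but None means only 'does not exist'."""
    if seen is None:
        seen = set()
    if name not in index:
        if seen:
            # We arrived here by following a link, so the link is broken.
            raise BrokenSymlink(name)
        return None
    kind, value = index[name]
    if kind != "link":
        return (kind, value)
    if name in seen:
        raise SymlinkLoop(name)
    seen.add(name)
    return resolve_strict(index, value, seen)
```

With this split, a caller can treat a missing name, a broken link and a loop differently without inspecting how the None was produced.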
 


[buildstream] 09/43: virtual_directory_import.py: Add random test generator

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit ef7c4cdbe7b686eabdacec59950e088091c5fe4c
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 5 13:43:49 2018 +0100

    virtual_directory_import.py: Add random test generator
---
 tests/storage/virtual_directory_import.py | 37 ++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index b76fef7..2e5163c 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -1,5 +1,6 @@
 import os
 import pytest
+import random
 from tests.testutils import cli
 
 from buildstream.storage import CasBasedDirectory
@@ -27,6 +28,7 @@ root_filesets = [
 ]
 
 empty_hash_ref = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
+RANDOM_SEED = 69105
 
 
 def generate_import_roots(directory):
@@ -48,6 +50,33 @@ def generate_import_roots(directory):
                 os.symlink(content, os.path.join(rootdir, path))
 
 
+def generate_random_root(directory):
+    random.seed(RANDOM_SEED)
+    rootname = "root6"
+    rootdir = os.path.join(directory, "content", rootname)
+    things = []
+    locations = ['.']
+    for i in range(0, 100):
+        location = random.choice(locations)
+        thingname = "node{}".format(i)
+        thing = random.choice(['dir', 'link', 'file'])
+        target = os.path.join(rootdir, location, thingname)
+        if thing == 'dir':
+            os.makedirs(target)
+            locations.append(os.path.join(location, thingname))
+        elif thing == 'file':
+            with open(target, "wt") as f:
+                f.write("This is node {}\n".format(i))
+        elif thing == 'link':
+            # TODO: Make some relative symlinks
+            if random.randint(1, 3) == 1 or len(things) == 0:
+                os.symlink("/broken", target)
+            else:
+                os.symlink(random.choice(things), target)
+        things.append(os.path.join(location, thingname))
+        print("Generated {}/{} ".format(rootdir, things[-1]))
+
+
 def file_contents(path):
     with open(path, "r") as f:
         result = f.read()
@@ -64,6 +93,7 @@ def create_new_casdir(root_number, fake_context, tmpdir):
     assert d.ref.hash != empty_hash_ref
     return d
 
+
 def create_new_filedir(root_number, tmpdir):
     root = os.path.join(tmpdir, "vdir")
     os.makedirs(root)
@@ -117,7 +147,7 @@ def test_cas_import(cli, tmpdir, original, overlay):
     fake_context.artifactdir = tmpdir
     # Create some fake content
     generate_import_roots(tmpdir)
-
+    generate_random_root(tmpdir)
     d = create_new_casdir(original, fake_context, tmpdir)
     d2 = create_new_casdir(overlay, fake_context, tmpdir)
     d.import_files(d2)
@@ -145,12 +175,13 @@ def test_cas_import(cli, tmpdir, original, overlay):
             assert os.path.isdir(realpath)
 
 
-@pytest.mark.parametrize("root", [1, 2, 3, 4, 5])
+@pytest.mark.parametrize("root", [1, 2, 3, 4, 5, 6])
 def test_directory_listing(cli, tmpdir, root):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
     # Create some fake content
     generate_import_roots(tmpdir)
+    generate_random_root(tmpdir)
 
     d = create_new_filedir(root, tmpdir)
     filelist = list(d.list_relative_paths())
@@ -162,4 +193,4 @@ def test_directory_listing(cli, tmpdir, root):
     print("{}".format(filelist))
     print("filelist for root {} via CasBasedDirectory:".format(root))
     print("{}".format(filelist2))
-    assert(filelist==filelist2)
+    assert filelist == filelist2
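
The generator added in this commit seeds `random` with a fixed value so that every run builds the same tree. A minimal standalone sketch of that idea follows. `build_random_tree` and `snapshot` are hypothetical names, not part of the BuildStream test suite. The sketch uses a private `random.Random` instance instead of the module-level seed, and every symlink is dangling to keep it short.

```python
import os
import random
import tempfile


def build_random_tree(rootdir, seed, count=20):
    """Create `count` random dirs, files and symlinks under rootdir.

    Hypothetical helper: same idea as generate_random_root() in the diff,
    but with a private Random instance so other tests are unaffected.
    """
    rng = random.Random(seed)
    os.makedirs(rootdir, exist_ok=True)
    locations = ['.']   # directories that can receive new nodes
    things = []         # (relative path, kind) of everything created so far
    for i in range(count):
        location = rng.choice(locations)
        relpath = os.path.normpath(os.path.join(location, "node{}".format(i)))
        kind = rng.choice(['dir', 'link', 'file'])
        target = os.path.join(rootdir, relpath)
        if kind == 'dir':
            os.makedirs(target)
            locations.append(relpath)
        elif kind == 'file':
            with open(target, "w") as f:
                f.write("This is node {}\n".format(i))
        else:
            # Dangling absolute link, like the "/broken" case in the diff
            os.symlink("/broken", target)
        things.append((relpath, kind))
    return things


def snapshot(rootdir):
    """Return a sorted list of (relative path, kind) found on disk."""
    found = []
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                kind = 'link'
            elif os.path.isdir(full):
                kind = 'dir'
            else:
                kind = 'file'
            found.append((os.path.relpath(full, rootdir), kind))
    return sorted(found)
```

Calling `build_random_tree` twice with the same seed in two different directories should give identical `snapshot` results, which is the property the parametrized tests rely on.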


[buildstream] 31/43: Rearrange comment

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit e5e5be9dccb1c0ed2f6ef919a2fe2218edadf8ff
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 26 14:11:37 2018 +0100

    Rearrange comment
---
 buildstream/storage/_casbaseddirectory.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index c5768a8..a7d656d 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -400,11 +400,11 @@ class CasBasedDirectory(Directory):
             if c == ".":
                 pass
             elif c == "..":
+                if directory.parent is not None:
+                    directory = directory.parent
                 # If directory.parent *is* None, this is an attempt to access
                 # '..' from the root, which is valid under POSIX; it just
                 # returns the root.                
-                if directory.parent is not None:
-                    directory = directory.parent
             else:
                 if c in directory.index:
                     f = directory._resolve(c, absolute_symlinks_resolve, first_seen_object=first_seen_object)
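
The comment moved in this commit refers to the POSIX rule that `..` at the root resolves to the root itself. A small sketch of that rule, using a hypothetical `Node` class with a `parent` pointer as a stand-in for the directory object, plus the standard library's `posixpath.normpath` for comparison:

```python
import posixpath


class Node:
    """Hypothetical stand-in for a directory with a parent pointer."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def up(self):
        # '..' from the root stays at the root, mirroring POSIX behaviour
        return self.parent if self.parent is not None else self


root = Node("")
child = Node("a", parent=root)

print(child.up() is root)           # True
print(root.up() is root)            # True
print(posixpath.normpath("/../a"))  # /a
```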


[buildstream] 25/43: virtual_directory_test.py: More fixed examples and better test names

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit fa98975e1594d7802ecd9b83033016f0d0735a51
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 16:48:50 2018 +0100

    virtual_directory_test.py: More fixed examples and better test names
---
 tests/storage/virtual_directory_import.py | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index dfe3580..9207193 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -20,11 +20,17 @@ class FakeContext():
 # 'F' (file), 'S' (symlink) or 'D' (directory) with content being the contents
 # for a file or the destination for a symlink.
 root_filesets = [
+    # Arbitrary test sets
     [('a/b/c/textfile1', 'F', 'This is textfile 1\n')],
     [('a/b/c/textfile1', 'F', 'This is the replacement textfile 1\n')],
     [('a/b/d', 'D', '')],
     [('a/b/c', 'S', '/a/b/d')],
-    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')]
+    [('a/b/d', 'S', '/a/b/c')],
+    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')], 
+    [('a/b/c', 'D', ''), ('a/b/d', 'S', '/a/b/c')], 
+    [('a/b', 'F', 'This is textfile 1\n')],
+    [('a/b/c', 'F', 'This is textfile 1\n')],
+    [('a/b/c', 'D', '')]
 ]
 
 empty_hash_ref = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
@@ -178,8 +184,9 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
                     assert os.path.islink(realpath)
                     assert os.readlink(realpath) == content
             elif typename == 'D':
-                # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
-                assert os.path.isdir(realpath)
+                # We can't do any more tests than this because it depends on things present in the original. Blank directories
+                # here will be ignored and the original left in place.
+                assert os.path.lexists(realpath)
 
     # Now do the same thing with filebaseddirectories and check the contents match
     d3 = create_new_casdir(original, fake_context, tmpdir)
@@ -187,14 +194,15 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
     d3.import_files(d2)
     assert d.ref.hash == d3.ref.hash
 
-@pytest.mark.parametrize("original,overlay", combinations(range(1,6)))
+@pytest.mark.parametrize("original,overlay", combinations(range(1,len(root_filesets)+1)))
 def test_fixed_cas_import(cli, tmpdir, original, overlay):
     _import_test(tmpdir, original, overlay, generate_import_roots, verify_contents=True)
 
 @pytest.mark.parametrize("original,overlay", combinations(range(1,11)))
-def test_random_cas_import(cli, tmpdir, original, overlay):
+def test_random_cas_import_fast(cli, tmpdir, original, overlay):
     _import_test(tmpdir, original, overlay, generate_random_root, verify_contents=False)
 
+    
 def _listing_test(tmpdir, root, generator_function):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
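
This commit relaxes the directory check from `os.path.isdir` to `os.path.lexists`. The difference is that `isdir` follows symlinks, while `lexists` only asks whether the path entry itself exists. A short sketch showing both on a valid and a broken symlink (the file names are arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    real_dir = os.path.join(tmp, "d")
    good_link = os.path.join(tmp, "good")
    broken_link = os.path.join(tmp, "broken")
    os.mkdir(real_dir)
    os.symlink(real_dir, good_link)
    os.symlink(os.path.join(tmp, "missing"), broken_link)

    results = {
        # (isdir, lexists) for a link that points at a real directory
        "good": (os.path.isdir(good_link), os.path.lexists(good_link)),
        # (isdir, exists, lexists) for a link whose target is absent
        "broken": (os.path.isdir(broken_link),
                   os.path.exists(broken_link),
                   os.path.lexists(broken_link)),
    }

print(results)
```

Only `lexists` reports the broken link as present, so the new assertion accepts any entry at that path regardless of what it points to.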


[buildstream] 06/43: _casbaseddirectory: Optionally resolve absolute symlinks

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 0e6901c8e74d6d2da7d2a1f03261f2379c444777
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Wed Oct 3 11:37:51 2018 +0100

    _casbaseddirectory: Optionally resolve absolute symlinks
---
 buildstream/storage/_casbaseddirectory.py | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 388b8ec..ef7bb68 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -38,6 +38,8 @@ from .._exceptions import BstError
 from .directory import Directory, VirtualDirectoryError
 from ._filebaseddirectory import FileBasedDirectory
 from ..utils import FileListResult, safe_copy, list_relative_paths
+from ..utils import FileListResult, safe_copy, list_relative_paths, _relative_symlink_target
+from .._artifactcache.cascache import CASCache
 
 
 class IndexEntry():
@@ -285,7 +287,7 @@ class CasBasedDirectory(Directory):
                 directory = directory.descend(c, create=True)
         return directory
 
-    def _resolve(self, name):
+    def _resolve(self, name, absolute_symlinks_resolve=True):
         """ Resolves any name to an object. If the name points to a symlink in this 
         directory, it returns the thing it points to, recursively. Returns a CasBasedDirectory, FileNode or None. Never creates a directory or otherwise alters the directory. """
         # First check if it's a normal object and return that
@@ -304,9 +306,13 @@ class CasBasedDirectory(Directory):
 
         absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
         if absolute:
-            start_directory = self.find_root()
-            # Discard the first empty element
-            components.pop(0)
+            if absolute_symlinks_resolve:
+                start_directory = self.find_root()
+                # Discard the first empty element
+                components.pop(0)
+            else:
+                print("  _resolve: Absolute symlink, which we won't resolve.")
+                return None
         else:
             start_directory = self
         directory = start_directory
@@ -325,7 +331,7 @@ class CasBasedDirectory(Directory):
                     directory = directory.parent
             else:
                 if c in directory.index:
-                    f = directory._resolve(c)
+                    f = directory._resolve(c, absolute_symlinks_resolve)
                     # Ultimately f must now be a file or directory
                     if isinstance(f, CasBasedDirectory):
                         directory = f
@@ -608,7 +614,7 @@ class CasBasedDirectory(Directory):
         print("Running list_relative_paths on relpath {}. files={}, symlinks={}".format(relpath, [f[0] for f in file_list], [s[0] for s in symlink_list]))
 
         for (k, v) in sorted(symlink_list):
-            target = self._resolve(k)
+            target = self._resolve(k, absolute_symlinks_resolve=True)
             if isinstance(target, CasBasedDirectory):
                 print("Adding the resolved symlink {} which resolves to {} to our directory list".format(k, target))
                 directory_list.append((k,IndexEntry(k, buildstream_object=target)))
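
The resolution logic in `_resolve` can be sketched on a small in-memory tree. This is an illustration only: `Dir`, `Symlink`, `resolve` and the `follow_absolute` flag are hypothetical names standing in for `CasBasedDirectory`, `SymlinkNode`, `_resolve` and `absolute_symlinks_resolve`. Unlike the diff, the sketch returns `None` instead of raising when a file appears in the middle of a path, and it adds a depth limit as its own guard against symlink loops.

```python
class Symlink:
    def __init__(self, target):
        self.target = target


class Dir:
    def __init__(self, parent=None):
        self.parent = parent
        self.index = {}   # name -> Dir, Symlink or str (file contents)

    def mkdir(self, name):
        d = Dir(parent=self)
        self.index[name] = d
        return d

    def root(self):
        d = self
        while d.parent is not None:
            d = d.parent
        return d


def resolve(directory, name, follow_absolute=True, _depth=0):
    """Return the Dir or file that `name` refers to, or None if it is
    missing, broken, or an absolute link we were asked not to follow."""
    if _depth > 40:
        return None    # treat symlink loops as broken (guard not in the diff)
    entry = directory.index.get(name)
    if not isinstance(entry, Symlink):
        return entry   # Dir, file contents, or None when absent
    components = entry.target.split("/")
    if entry.target.startswith("/"):
        if not follow_absolute:
            return None
        directory = directory.root()
    for i, c in enumerate(components):
        if c in ("", "."):
            continue
        if c == "..":
            # '..' at the root stays at the root
            if directory.parent is not None:
                directory = directory.parent
            continue
        found = resolve(directory, c, follow_absolute, _depth + 1)
        if isinstance(found, Dir):
            directory = found
        else:
            # A file or None is only acceptable as the last component
            remaining = [x for x in components[i + 1:] if x not in ("", ".")]
            return None if remaining else found
    return directory


root = Dir()
a = root.mkdir("a")
b = a.mkdir("b")
d = b.mkdir("d")
b.index["c"] = Symlink("/a/b/d")        # absolute link, as in root_filesets
b.index["rel"] = Symlink("../b/d")      # relative link
b.index["dangling"] = Symlink("/broken")

print(resolve(b, "c") is d)                      # True
print(resolve(b, "c", follow_absolute=False))    # None
print(resolve(b, "rel") is d)                    # True
print(resolve(b, "dangling"))                    # None
```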


[buildstream] 14/43: Fix 'remove_item'->delete_entry

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 6d16d70c418ef70bb60456408123008c293e037f
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 17:56:12 2018 +0100

    Fix 'remove_item'->delete_entry
---
 buildstream/storage/_casbaseddirectory.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index f7ef35b..a3bed3e 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -166,13 +166,13 @@ class CasBasedDirectory(Directory):
         existing_item = self._find_pb2_entry(name)
         if isinstance(existing_item, remote_execution_pb2.FileNode):
             # Directory imported over file with same name
-            self.remove_item(name)
+            self.delete_entry(name)
         elif isinstance(existing_item, remote_execution_pb2.SymlinkNode):
             # Directory imported over symlink with same source name
             if self.symlink_target_is_directory(existing_item):
                 return self._resolve_symlink_or_directory(name) # That's fine; any files in the source directory should end up at the target of the symlink.
             else:
-                self.remove_item(name) # Symlinks to files get replaced
+                self.delete_entry(name) # Symlinks to files get replaced
         return self.descend(name, create=True) # Creates the directory if it doesn't already exist.
 
 


[buildstream] 10/43: virtual_directory_import.py: Check import via a file-based directory

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit c996eb7606895caccec558965e8b617b805a6e7d
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 5 13:59:34 2018 +0100

    virtual_directory_import.py: Check import via a file-based directory
---
 tests/storage/virtual_directory_import.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 2e5163c..1c78c1b 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -174,6 +174,12 @@ def test_cas_import(cli, tmpdir, original, overlay):
             # Note that isdir accepts symlinks to dirs, so a symlink to a dir is acceptable.
             assert os.path.isdir(realpath)
 
+    # Now do the same thing with filebaseddirectories and check the contents match
+    d3 = create_new_casdir(original, fake_context, tmpdir)
+    d4 = create_new_filedir(overlay, tmpdir)
+    d3.import_files(d2)
+    assert d.ref.hash == d3.ref.hash
+
 
 @pytest.mark.parametrize("root", [1, 2, 3, 4, 5, 6])
 def test_directory_listing(cli, tmpdir, root):
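
The check `d.ref.hash == d3.ref.hash` relies on content addressing: two directory trees with the same contents have the same digest, however they were built. A rough sketch of that property on nested dicts follows. `tree_digest` and its encoding are hypothetical and are not the Remote Execution API `Directory` serialization. With this encoding an empty tree hashes to the SHA-256 of the empty string, which is the value stored in `empty_hash_ref`.

```python
import hashlib


def tree_digest(tree):
    """Digest of a nested dict: str values are file contents, dict values
    are subdirectories. Hypothetical encoding, for illustration only."""
    h = hashlib.sha256()
    for name in sorted(tree):
        value = tree[name]
        if isinstance(value, dict):
            kind, payload = b"D", tree_digest(value)
        else:
            kind, payload = b"F", hashlib.sha256(value.encode()).hexdigest()
        h.update(kind + name.encode() + b"\0" + payload.encode() + b"\n")
    return h.hexdigest()


first = {"a": {"b": {"c": {"textfile1": "This is textfile 1\n"}, "d": {}}}}
second = {"a": {"b": {"d": {}, "c": {"textfile1": "This is textfile 1\n"}}}}
changed = {"a": {"b": {"c": {"textfile1": "This is the replacement textfile 1\n"}, "d": {}}}}

print(tree_digest(first) == tree_digest(second))   # True: insertion order is irrelevant
print(tree_digest(first) == tree_digest(changed))  # False: file contents differ
```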


[buildstream] 05/43: _casbaseddirectory: Corrections to list_relative_files, adds _resolve

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 64763b3c4d96aaeff8e14b7fdcba73a51e0f4b03
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 2 16:15:45 2018 +0100

    _casbaseddirectory: Corrections to list_relative_files, adds _resolve
---
 buildstream/storage/_casbaseddirectory.py | 85 +++++++++++++++++++++++++++----
 1 file changed, 76 insertions(+), 9 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index e5c83c5..388b8ec 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -285,6 +285,67 @@ class CasBasedDirectory(Directory):
                 directory = directory.descend(c, create=True)
         return directory
 
+    def _resolve(self, name):
+        """ Resolves any name to an object. If the name points to a symlink in this 
+        directory, it returns the thing it points to, recursively. Returns a CasBasedDirectory, FileNode or None. Never creates a directory or otherwise alters the directory. """
+        # First check if it's a normal object and return that
+
+        if name not in self.index:
+            return None
+        index_entry = self.index[name]
+        if isinstance(index_entry.buildstream_object, Directory):
+            return index_entry.buildstream_object
+        elif isinstance(index_entry.pb_object, remote_execution_pb2.FileNode):
+            return index_entry.pb_object
+        
+        assert isinstance(index_entry.pb_object, remote_execution_pb2.SymlinkNode)
+        symlink = index_entry.pb_object
+        components = symlink.target.split(CasBasedDirectory._pb2_path_sep)
+
+        absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
+        if absolute:
+            start_directory = self.find_root()
+            # Discard the first empty element
+            components.pop(0)
+        else:
+            start_directory = self
+        directory = start_directory
+        print("Resolve {}: starting from {}".format(symlink.target, start_directory))
+        while True:
+            if not components:
+                # We ran out of path elements and ended up in a directory
+                return directory
+            c = components.pop(0)
+            if c == "..":
+                print("  resolving {}: up-dir".format(c))
+                # If directory.parent *is* None, this is an attempt to access
+                # '..' from the root, which is valid under POSIX; it just
+                # returns the root.                
+                if directory.parent is not None:
+                    directory = directory.parent
+            else:
+                if c in directory.index:
+                    f = directory._resolve(c)
+                    # Ultimately f must now be a file or directory
+                    if isinstance(f, CasBasedDirectory):
+                        directory = f
+                        print("  resolving {}: dir".format(c))
+
+                    else:
+                        # This is a file or None (i.e. broken symlink)
+                        print("  resolving {}: file/broken link".format(c))
+                        if components:
+                            # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
+                            raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
+                        else:
+                            return f
+                else:
+                    print("  resolving {}: nonexistent!".format(c))
+                    return None
+
+        # Shouldn't get here.
+        
+
     def _check_replacement(self, name, path_prefix, fileListResult):
         """ Checks whether 'name' exists, and if so, whether we can overwrite it.
         If we can, add the name to 'overwritten_files' and delete the existing entry.
@@ -541,21 +602,27 @@ class CasBasedDirectory(Directory):
         """
 
         print("Running list_relative_paths on relpath {}".format(relpath))
-        symlink_list = filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.SymlinkNode), self.index.items())
-        file_list = filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.FileNode), self.index.items())
+        symlink_list = list(filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.SymlinkNode), self.index.items()))
+        file_list = list(filter(lambda i: isinstance(i[1].pb_object, remote_execution_pb2.FileNode), self.index.items()))
+        directory_list = list(filter(lambda i: isinstance(i[1].buildstream_object, CasBasedDirectory), self.index.items()))
         print("Running list_relative_paths on relpath {}. files={}, symlinks={}".format(relpath, [f[0] for f in file_list], [s[0] for s in symlink_list]))
 
         for (k, v) in sorted(symlink_list):
-            print("Yielding symlink {}".format(k))
-            yield os.path.join(relpath, k)
-        for (k, v) in sorted(file_list):
-            print("Yielding file {}".format(k))
-            yield os.path.join(relpath, k)
-        else:
+            target = self._resolve(k)
+            if isinstance(target, CasBasedDirectory):
+                print("Adding the resolved symlink {} which resolves to {} to our directory list".format(k, target))
+                directory_list.append((k,IndexEntry(k, buildstream_object=target)))
+            else:
+                # Broken symlinks are also considered files!
+                file_list.append((k,v))
+        if file_list == [] and relpath != "":
             print("Yielding empty directory name {}".format(relpath))
             yield relpath
+        else:
+            for (k, v) in sorted(file_list):
+                print("Yielding file {}".format(k))
+                yield os.path.join(relpath, k)
 
-        directory_list = filter(lambda i: isinstance(i[1].buildstream_object, CasBasedDirectory), self.index.items())
         for (k, v) in sorted(directory_list):
             print("Yielding from subdirectory name {}".format(k))
             yield from v.buildstream_object.list_relative_paths(relpath=os.path.join(relpath, k))
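
The branch structure of the revised `list_relative_paths` (files first, a directory with no files reported by its own name, then recursion into sorted subdirectories) can be sketched on nested dicts. This mirrors only the logic visible in this diff, with symlink handling left out, and it makes no claim about the final behaviour on the branch. The dict representation is an assumption made for the example.

```python
import posixpath


def list_relative_paths(tree, relpath=""):
    """Yield paths in the order used by the diff above.

    `tree` is a nested dict: dict values are directories, anything else
    is treated as a file.
    """
    files = sorted(k for k, v in tree.items() if not isinstance(v, dict))
    dirs = sorted(k for k, v in tree.items() if isinstance(v, dict))
    if not files and relpath != "":
        # A directory with no files of its own is reported by name
        yield relpath
    else:
        for name in files:
            yield posixpath.join(relpath, name)
    for name in dirs:
        yield from list_relative_paths(tree[name], posixpath.join(relpath, name))


tree = {"a": {"b": {"c": {"textfile1": "This is textfile 1\n"}, "d": {}}}}
print(list(list_relative_paths(tree)))
# ['a', 'a/b', 'a/b/c/textfile1', 'a/b/d']
```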


[buildstream] 24/43: Make the duplication test optional in cas_based_directory

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit af6b327fe61f8629d07f7c1d523f2283bf861bbd
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 16:48:23 2018 +0100

    Make the duplication test optional in cas_based_directory
---
 buildstream/storage/_casbaseddirectory.py | 35 +++++++++++++++----------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index ce0ec2b..f1799b6 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -854,35 +854,34 @@ class CasBasedDirectory(Directory):
         if self.parent:
             self.parent._recalculate_recursing_up(self)
         
-        # Duplicate the current directory
-
+        duplicate_test = False
         
         print("Original CAS before CAS-based import: {}".format(self.show_files_recursive()))
         print("Original CAS hash: {}".format(self.ref.hash))
         duplicate_cas = None
         self._verify_unique()
         if isinstance(external_pathspec, CasBasedDirectory):
-            duplicate_cas = CasBasedDirectory(self.context, ref=copy.copy(self.ref))
-            duplicate_cas._verify_unique()
+            if duplicate_test:
+                duplicate_cas = CasBasedDirectory(self.context, ref=copy.copy(self.ref))
+                duplicate_cas._verify_unique()
+                print("Duplicated CAS before file-based import: {}".format(duplicate_cas.show_files_recursive()))
+                print("Duplicate CAS hash: {}".format(duplicate_cas.ref.hash))
             print("-"*80 + "Performing direct CAS-to-CAS import")
-            print("Duplicated CAS before file-based import: {}".format(duplicate_cas.show_files_recursive()))
-            print("Duplicate CAS hash: {}".format(duplicate_cas.ref.hash))
             result = self._import_cas_into_cas(external_pathspec, files=files)
             self._verify_unique()
             print("Result of cas-to-cas import: {}".format(self.show_files_recursive()))
             print("-"*80 + "Performing round-trip import via file system")
-            with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
-                external_pathspec.export_files(tmpdir)
-                if files is None:
-                    files = list(list_relative_paths(tmpdir))
-                print("Importing from filesystem: filelist is: {}".format(files))
-                duplicate_cas._import_files_from_directory(tmpdir, files=files)
-                duplicate_cas._recalculate_recursing_down()
-                if duplicate_cas.parent:
-                    duplicate_cas.parent._recalculate_recursing_up(duplicate_cas)
-                print("Result of direct import: {}".format(duplicate_cas.show_files_recursive()))
-               
-
+            if duplicate_test:
+                with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
+                    external_pathspec.export_files(tmpdir)
+                    if files is None:
+                        files = list(list_relative_paths(tmpdir))
+                    print("Importing from filesystem: filelist is: {}".format(files))
+                    duplicate_cas._import_files_from_directory(tmpdir, files=files)
+                    duplicate_cas._recalculate_recursing_down()
+                    if duplicate_cas.parent:
+                        duplicate_cas.parent._recalculate_recursing_up(duplicate_cas)
+                    print("Result of direct import: {}".format(duplicate_cas.show_files_recursive()))
         else:
             print("-"*80 + "Performing initial import")
             if isinstance(external_pathspec, FileBasedDirectory):


[buildstream] 15/43: casbaseddirectory: Various fixes.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 6b741f35ddfafe8366d78c705bd350845dddf1cd
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 23 17:57:16 2018 +0100

    casbaseddirectory: Various fixes.
---
 buildstream/storage/_casbaseddirectory.py | 44 ++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index a3bed3e..fd7b28a 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -289,9 +289,12 @@ class CasBasedDirectory(Directory):
                 return entry.descend(subdirectory_spec[1:], create)
             else:
                 # May be a symlink
+                target = self._resolve(subdirectory_spec[0])
+                if isinstance(target, CasBasedDirectory):
+                    return target
                 error = "Cannot descend into {}, which is a '{}' in the directory {}"
                 raise VirtualDirectoryError(error.format(subdirectory_spec[0],
-                                                         type(entry).__name__,
+                                                         type(self.index[subdirectory_spec[0]].pb_object).__name__,
                                                          self))
         else:
             if create:
@@ -329,6 +332,7 @@ class CasBasedDirectory(Directory):
             return self.index[name].buildstream_object
         # OK then, it's a symlink
         symlink = self._find_pb2_entry(name)
+        assert isinstance(symlink, remote_execution_pb2.SymlinkNode)
         absolute = symlink.target.startswith(CasBasedDirectory._pb2_absolute_path_prefix)
         if absolute:
             root = self.find_root()
@@ -345,6 +349,16 @@ class CasBasedDirectory(Directory):
                 directory = directory.descend(c, create=True)
         return directory
 
+    def _is_followable(self, name):
+        """ Returns true if this is a directory or symlink to a valid directory. """
+        if name not in self.index:
+            return False
+        if isinstance(self.index[name].buildstream_object, Directory):
+            return True
+        target = self._resolve(name)
+        print("Is {} followable? Resolved to {}".format(name, target))
+        return isinstance(target, CasBasedDirectory) or target is None
+
     def _resolve_symlink(self, node):
         """Same as _resolve_symlink_or_directory but takes a SymlinkNode.
         """
@@ -477,7 +491,13 @@ class CasBasedDirectory(Directory):
         """ _import_directory_recursively and _import_files_from_directory will be called alternately
         as a directory tree is descended. """
         if directory_name in self.index:
-            subdir = self._resolve_symlink_or_directory(directory_name)
+            if self._is_followable(directory_name): 
+                subdir = self._resolve_symlink_or_directory(directory_name)
+            else:
+                print("Overwriting unfollowable thing {}".format(directory_name))
+                self.delete_entry(directory_name)
+                subdir = self._add_directory(directory_name)
+                # TODO: Add this to the list of overwritten things.
         else:
             subdir = self._add_directory(directory_name)
         new_path_prefix = os.path.join(path_prefix, directory_name)
@@ -608,6 +628,12 @@ class CasBasedDirectory(Directory):
                 if dirname not in processed_directories:
                     # Now strip off the first directory name and import files recursively.
                     subcomponents = CasBasedDirectory.files_in_subdir(files, dirname)
+                    # We will fail at this point if there is a file or symlink to file called 'dirname'.
+                    if dirname in self.index:
+                        x = self._resolve(dirname)
+                        if isinstance(x, remote_execution_pb2.FileNode):
+                            self.delete_entry(dirname)
+                            result.overwritten.append(f)
                     self.create_directory(dirname)
                     print("Creating destination in {}: {}".format(self, dirname))
                     dest_subdir = self._resolve_symlink_or_directory(dirname)
@@ -689,6 +715,18 @@ class CasBasedDirectory(Directory):
             print("Extracted all files from source directory '{}': {}".format(source_directory, files))
         return self._partial_import_cas_into_cas(source_directory, list(files))
 
+    def _describe(self, thing):
+        # Describes protocol buffer objects
+        if isinstance(thing, remote_execution_pb2.DirectoryNode):
+            return "directory called {}".format(thing.name)
+        elif isinstance(thing, remote_execution_pb2.SymlinkNode):
+            return "symlink called {} pointing to {}".format(thing.name, thing.target)
+        elif isinstance(thing, remote_execution_pb2.FileNode):
+            return "file called {}".format(thing.name)
+        else:
+            return "strange thing"
+        
+    
     def showdiff(self, other):
         print("Diffing {} and {}:".format(self, other))
 
@@ -702,7 +740,7 @@ class CasBasedDirectory(Directory):
                     return False
                 item2 = l2[index]
                 if item1.name != item2.name:
-                    print("Items do not match: {} in l1, {} in l2".format(item1.name, item2.name))
+                    print("Items do not match: {}, a {} in l1, vs {}, a {} in l2".format(item1.name, self._describe(item1), item2.name, self._describe(item2)))
                     return False
                 index += 1
             if index != len(l2):
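
One of the fixes in this commit deletes a file that sits where an imported directory needs to go and records it as overwritten. A simplified sketch of that rule on nested dicts follows. `import_tree` is a hypothetical function, symlinks are not modelled, and a file imported over an existing directory simply replaces it here, which is a simplification and not a statement about what BuildStream does.

```python
import posixpath


def import_tree(dest, source, prefix="", overwritten=None):
    """Merge `source` into `dest` in place and return overwritten paths.

    Both trees are nested dicts: dict values are directories, str values
    are file contents.
    """
    if overwritten is None:
        overwritten = []
    for name, value in source.items():
        path = posixpath.join(prefix, name)
        if isinstance(value, dict):
            existing = dest.get(name)
            if not isinstance(existing, dict):
                if existing is not None:
                    # A file blocked directory creation: record and replace it
                    overwritten.append(path)
                dest[name] = {}
            import_tree(dest[name], value, path, overwritten)
        else:
            if name in dest:
                overwritten.append(path)
            dest[name] = value
    return overwritten


dest = {"a": {"b": "This is textfile 1\n"}}
source = {"a": {"b": {"c": "This is textfile 1\n"}}}
replaced = import_tree(dest, source)
print(replaced)   # ['a/b']
print(dest)       # {'a': {'b': {'c': 'This is textfile 1\n'}}}
```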


[buildstream] 28/43: casbaseddirectory: Remove roundtrip checking code

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 7d9d395051f7e8b1b032d32c75549cc1cf181ce6
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 17:32:05 2018 +0100

    casbaseddirectory: Remove roundtrip checking code
---
 buildstream/storage/_casbaseddirectory.py | 28 +---------------------------
 1 file changed, 1 insertion(+), 27 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 6a25602..e30a23d 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -835,39 +835,17 @@ class CasBasedDirectory(Directory):
 
         print("Directory before import: {}".format(self.show_files_recursive()))
 
-        # Sync self
+        # Sync self (necessary?)
         self._recalculate_recursing_down()
         if self.parent:
             self.parent._recalculate_recursing_up(self)
         
-        duplicate_test = False
-        
-        print("Original CAS before CAS-based import: {}".format(self.show_files_recursive()))
-        print("Original CAS hash: {}".format(self.ref.hash))
-        duplicate_cas = None
         self._verify_unique()
         if isinstance(external_pathspec, CasBasedDirectory):
-            if duplicate_test:
-                duplicate_cas = CasBasedDirectory(self.context, ref=copy.copy(self.ref))
-                duplicate_cas._verify_unique()
-                print("Duplicated CAS before file-based import: {}".format(duplicate_cas.show_files_recursive()))
-                print("Duplicate CAS hash: {}".format(duplicate_cas.ref.hash))
             print("-"*80 + "Performing direct CAS-to-CAS import")
             result = self._import_cas_into_cas(external_pathspec, files=files)
             self._verify_unique()
             print("Result of cas-to-cas import: {}".format(self.show_files_recursive()))
-            print("-"*80 + "Performing round-trip import via file system")
-            if duplicate_test:
-                with tempfile.TemporaryDirectory(prefix="roundtrip") as tmpdir:
-                    external_pathspec.export_files(tmpdir)
-                    if files is None:
-                        files = list(list_relative_paths(tmpdir))
-                    print("Importing from filesystem: filelist is: {}".format(files))
-                    duplicate_cas._import_files_from_directory(tmpdir, files=files)
-                    duplicate_cas._recalculate_recursing_down()
-                    if duplicate_cas.parent:
-                        duplicate_cas.parent._recalculate_recursing_up(duplicate_cas)
-                    print("Result of direct import: {}".format(duplicate_cas.show_files_recursive()))
         else:
             print("-"*80 + "Performing initial import")
             if isinstance(external_pathspec, FileBasedDirectory):
@@ -891,10 +869,6 @@ class CasBasedDirectory(Directory):
         self._recalculate_recursing_down()
         if self.parent:
             self.parent._recalculate_recursing_up(self)
-        if duplicate_cas:
-            if duplicate_cas.ref.hash != self.ref.hash:
-                self.showdiff(duplicate_cas)
-                raise VirtualDirectoryError("Mismatch between file-imported result {} and cas-to-cas imported result {}.".format(duplicate_cas.ref.hash,self.ref.hash))
 
         return result
 


[buildstream] 38/43: Remove some prints and whitespace

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 7bb8f20f9f955a68fb6c70f6425cc2e48fbd87d4
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 11:17:21 2018 +0000

    Remove some prints and whitespace
---
 buildstream/storage/_casbaseddirectory.py | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 75184d2..40c506d 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -87,9 +87,7 @@ class CasBasedDirectory(Directory):
         if ref:
             with open(self.cas_cache.objpath(ref), 'rb') as f:
                 self.pb2_directory.ParseFromString(f.read())
-                print("Opening ref {} and parsed into directory containing: {} {} {}.".format(ref.hash, [d.name for d in self.pb2_directory.directories],
-                                                                                        [d.name for d in self.pb2_directory.symlinks],
-                                                                                        [d.name for d in self.pb2_directory.files]))
+
         self.ref = ref
         self.index = OrderedDict()
         self.parent = parent
@@ -141,7 +139,6 @@ class CasBasedDirectory(Directory):
         # We don't need to do anything more than that; files were already added ealier, and symlinks are
         # part of the directory structure.
 
-
     def _find_pb2_entry(self, name):
         if name in self.index:
             return self.index[name].pb_object


[buildstream] 01/43: buildstream/storage/__init__.py: import CasBasedDirectory

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit b8330111164eadaf3d9ff42957d541d7018d95d3
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Sep 20 18:33:49 2018 +0100

    buildstream/storage/__init__.py: import CasBasedDirectory
---
 buildstream/storage/__init__.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/buildstream/storage/__init__.py b/buildstream/storage/__init__.py
index 33424ac..5571cd8 100644
--- a/buildstream/storage/__init__.py
+++ b/buildstream/storage/__init__.py
@@ -19,4 +19,5 @@
 #        Jim MacArthur <ji...@codethink.co.uk>
 
 from ._filebaseddirectory import FileBasedDirectory
+from ._casbaseddirectory import CasBasedDirectory
 from .directory import Directory


[buildstream] 32/43: Add a main() to virtual_directory_test.py to allow manual testing

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 505c6ddcfb59fa885b9aa7afa79da39ae1fd2ef2
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Mon Oct 29 14:25:44 2018 +0000

    Add a main() to virtual_directory_test.py to allow manual testing
---
 tests/storage/virtual_directory_import.py | 34 ++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 70e3c9a..fcfcc37 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -8,12 +8,14 @@ from tests.testutils import cli
 
 from buildstream.storage import CasBasedDirectory
 from buildstream.storage import FileBasedDirectory
+from buildstream._artifactcache import ArtifactCache
+from buildstream._artifactcache.cascache import CASCache
 from buildstream import utils
 
 class FakeContext():
     def __init__(self):
         self.config_cache_quota = "65536"
-
+        
     def get_projects(self):
         return []
 
@@ -156,6 +158,8 @@ def directory_not_empty(path):
 def _import_test(tmpdir, original, overlay, generator_function, verify_contents=False):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
+    print("Creating CAS Cache with artifact dir {}".format(tmpdir))
+    fake_context.artifactcache = CASCache(fake_context)
     # Create some fake content
     generator_function(original, tmpdir)
     if original != overlay:
@@ -228,6 +232,8 @@ def test_random_cas_import_fast(cli, tmpdir, original, overlay):
 def _listing_test(tmpdir, root, generator_function):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
+    print("Creating CAS Cache with artifact dir {}".format(tmpdir))
+    fake_context.artifactcache = CASCache(fake_context)
     # Create some fake content
     generator_function(root, tmpdir)
 
@@ -251,3 +257,29 @@ def test_random_directory_listing(cli, tmpdir, root):
 @pytest.mark.parametrize("root", [1, 2, 3, 4, 5])
 def test_fixed_directory_listing(cli, tmpdir, root):
     _listing_test(tmpdir, root, generate_import_roots)
+
+
+
+
+def main():
+    for i in range(1,6):
+        with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
+            test_fixed_directory_listing(None, tmpdirname, i)
+            
+    for i in range(1,11):
+        with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
+            test_random_directory_listing(None, tmpdirname, i)
+
+    for i in range(1,21):
+        for j in range(1,21):
+            with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
+                test_random_cas_import_fast(None, tmpdirname, i, j)
+
+    for i in range(1,len(root_filesets)+1):
+        for j in range(1,len(root_filesets)+1):
+            with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
+                test_fixed_cas_import(None, tmpdirname, i, j)
+                
+                
+if __name__=="__main__":
+    main()
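
The main() added above passes an absolute path inside one user's home directory as the TemporaryDirectory prefix, so a manual run depends on that path existing. A minimal sketch of a more portable variant follows. The helper name scratch_cas_dir and its base parameter are hypothetical and not part of the commit; only the standard tempfile and os modules are used.

```python
import os
import tempfile


def scratch_cas_dir(base=None):
    """Return a TemporaryDirectory whose name starts with 'cas'.

    'base' is a hypothetical parameter: when it is None, the platform's
    default temporary location is used instead of a fixed home directory.
    """
    if base is not None:
        # Make sure the chosen parent exists before creating the scratch dir.
        os.makedirs(base, exist_ok=True)
    return tempfile.TemporaryDirectory(prefix="cas", dir=base)


with scratch_cas_dir() as tmpdirname:
    created = os.path.isdir(tmpdirname)
    name = os.path.basename(tmpdirname)
```

Each loop in main() could then open its directory through such a helper, keeping the "cas" name prefix while leaving the location configurable.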


[buildstream] 27/43: Make virtual_directory_test do the cas roundtrip test instead of _casbaseddirectory

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit ac897f8ce6491e1e57ea05e9714a1d4567c1f2ce
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Thu Oct 25 17:26:21 2018 +0100

    Make virtual_directory_test do the cas roundtrip test instead of _casbaseddirectory
---
 tests/storage/virtual_directory_import.py | 38 ++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index 9207193..70e3c9a 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -1,11 +1,14 @@
 import os
 import pytest
 import random
+import copy
+import tempfile
 from tests.testutils import cli
 
+
 from buildstream.storage import CasBasedDirectory
 from buildstream.storage import FileBasedDirectory
-
+from buildstream import utils
 
 class FakeContext():
     def __init__(self):
@@ -84,7 +87,6 @@ def generate_random_root(rootno, directory):
                 os.symlink(symlink_destination, target)
                 description = "symlink pointing to {}".format(symlink_destination)
         things.append(os.path.join(location, thingname))
-        print("Generated {}/{}, a {}".format(rootdir, things[-1], description))
 
 
 def file_contents(path):
@@ -160,15 +162,24 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
         generator_function(overlay, tmpdir)
         
     d = create_new_casdir(original, fake_context, tmpdir)
+
+    #duplicate_cas = CasBasedDirectory(fake_context, ref=copy.copy(d.ref))
+    duplicate_cas = create_new_casdir(original, fake_context, tmpdir)
+
+    assert duplicate_cas.ref.hash == d.ref.hash
+
     d2 = create_new_casdir(overlay, fake_context, tmpdir)
     print("Importing dir {} into {}".format(overlay, original))
     d.import_files(d2)
-    d.export_files(os.path.join(tmpdir, "output"))
+    export_dir = os.path.join(tmpdir, "output")
+    roundtrip_dir = os.path.join(tmpdir, "roundtrip")
+    d2.export_files(roundtrip_dir)
+    d.export_files(export_dir)
     
     if verify_contents:
         for item in root_filesets[overlay - 1]:
             (path, typename, content) = item
-            realpath = resolve_symlinks(path, os.path.join(tmpdir, "output"))
+            realpath = resolve_symlinks(path, export_dir)
             if typename == 'F':
                 if os.path.isdir(realpath) and directory_not_empty(realpath):
                     # The file should not have overwritten the directory in this case.
@@ -189,10 +200,21 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
                 assert os.path.lexists(realpath)
 
     # Now do the same thing with filebaseddirectories and check the contents match
-    d3 = create_new_casdir(original, fake_context, tmpdir)
-    d4 = create_new_filedir(overlay, tmpdir)
-    d3.import_files(d2)
-    assert d.ref.hash == d3.ref.hash
+
+    files = list(utils.list_relative_paths(roundtrip_dir))
+    print("Importing from filesystem: filelist is: {}".format(files))
+    duplicate_cas._import_files_from_directory(roundtrip_dir, files=files)
+    duplicate_cas._recalculate_recursing_down()
+    if duplicate_cas.parent:
+        duplicate_cas.parent._recalculate_recursing_up(duplicate_cas)
+        print("Result of direct import: {}".format(duplicate_cas.show_files_recursive()))
+
+    assert duplicate_cas.ref.hash == d.ref.hash
+
+    #d3 = create_new_casdir(original, fake_context, tmpdir)
+    #d4 = create_new_filedir(overlay, tmpdir)
+    #d3.import_files(d2)
+    #assert d.ref.hash == d3.ref.hash
 
 @pytest.mark.parametrize("original,overlay", combinations(range(1,len(root_filesets)+1)))
 def test_fixed_cas_import(cli, tmpdir, original, overlay):
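
The round-trip check added to _import_test rests on one idea: two directory trees built by different routes should produce the same digest if their contents match. The sketch below illustrates that idea on plain filesystem trees. It is not BuildStream's CAS hashing; tree_digest and make_tree are hypothetical helpers, and the entries reuse the (path, typename, content) tuple shape of root_filesets. It assumes a platform where os.symlink is permitted.

```python
import hashlib
import os
import tempfile


def make_tree(base, entries):
    """Create files ('F'), directories ('D') and symlinks ('S') under base."""
    for path, typename, content in entries:
        target = os.path.join(base, path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        if typename == "D":
            os.makedirs(target, exist_ok=True)
        elif typename == "S":
            os.symlink(content, target)
        else:
            with open(target, "w") as f:
                f.write(content)


def tree_digest(root):
    """Return a SHA-256 hex digest summarising the tree under root.

    Entries are visited in sorted order, so the digest does not depend on
    the order in which the entries were created or listed.
    """
    h = hashlib.sha256()
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.islink(path):
            # Record the link target without following it.
            h.update("S:{}:{}\n".format(name, os.readlink(path)).encode())
        elif os.path.isdir(path):
            h.update("D:{}:{}\n".format(name, tree_digest(path)).encode())
        else:
            with open(path, "rb") as f:
                file_hash = hashlib.sha256(f.read()).hexdigest()
            h.update("F:{}:{}\n".format(name, file_hash).encode())
    return h.hexdigest()


entries = [
    ("a/b/d", "D", ""),
    ("a/b/c", "S", "/a/b/d"),
    ("a/f", "F", "This is textfile 1\n"),
]
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    make_tree(first, entries)
    make_tree(second, list(reversed(entries)))
    same = tree_digest(first) == tree_digest(second)
```

Here the two trees are created in opposite orders yet compare equal, which is the property the test asserts with duplicate_cas.ref.hash == d.ref.hash.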


[buildstream] 34/43: _casbaseddirectory.py: _resolve_symlink_or_directory -> _force_resolve

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit f206850543880e221f230fad43fad38df93a090d
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 09:59:02 2018 +0000

    _casbaseddirectory.py: _resolve_symlink_or_directory -> _force_resolve
---
 buildstream/storage/_casbaseddirectory.py | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 16899e9..d9fa2da 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -156,8 +156,8 @@ class CasBasedDirectory(Directory):
             self.delete_entry(name)
         elif isinstance(existing_item, remote_execution_pb2.SymlinkNode):
             # Directory imported over symlink with same source name
-            if self.symlink_target_is_directory(existing_item):
-                return self._resolve_symlink_or_directory(name) # That's fine; any files in the source directory should end up at the target of the symlink.
+            if self._symlink_target_is_directory(existing_item):
+                return self._force_resolve(name) # That's fine; any files in the source directory should end up at the target of the symlink.
             else:
                 self.delete_entry(name) # Symlinks to files get replaced
         return self.descend(name, create=True) # Creates the directory if it doesn't already exist.
@@ -303,7 +303,7 @@ class CasBasedDirectory(Directory):
         else:
             return self
 
-    def _resolve_symlink_or_directory(self, name):
+    def _force_resolve(self, name):
         """Used only by _import_files_from_directory. Tries to resolve a
         directory name or symlink name. 'name' must be an entry in this
         directory. It must be a single symlink or directory name, not a path
@@ -328,11 +328,6 @@ class CasBasedDirectory(Directory):
         print("Is {} followable? Resolved to {}".format(name, target))
         return isinstance(target, CasBasedDirectory) or target is None
 
-    def _resolve_symlink(self, node, force_create=True):
-        """Same as _resolve_symlink_or_directory but takes a SymlinkNode.
-        """
-        return self._resolve(node.name, force_create=True)
-    
     def _resolve(self, name, absolute_symlinks_resolve=True, force_create=False, first_seen_object = None):
         """ Resolves any name to an object. If the name points to a symlink in
         this directory, it returns the thing it points to,
@@ -458,7 +453,7 @@ class CasBasedDirectory(Directory):
         as a directory tree is descended. """
         if directory_name in self.index:
             if self._is_followable(directory_name): 
-                subdir = self._resolve_symlink_or_directory(directory_name)
+                subdir = self._force_resolve(directory_name)
             else:
                 print("Overwriting unfollowable thing {}".format(directory_name))
                 self.delete_entry(directory_name)
@@ -513,8 +508,8 @@ class CasBasedDirectory(Directory):
             dirname += os.path.sep
         return [f[len(dirname):] for f in sorted_files if f.startswith(dirname)]
 
-    def symlink_target_is_directory(self, symlink_node):
-        x = self._resolve_symlink(symlink_node, force_create=False)
+    def _symlink_target_is_directory(self, symlink_node):
+        x = self._resolve(symlink_node.name)
         return isinstance(x, CasBasedDirectory)
 
     def _partial_import_cas_into_cas(self, source_directory, files, path_prefix="", file_list_required=True):
@@ -547,8 +542,8 @@ class CasBasedDirectory(Directory):
                         else:
                             dest_subdir = x
                     else:
-                        self.create_directory(dirname)
-                        dest_subdir = self._resolve_symlink_or_directory(dirname)
+                        self.create_directory(dirname) # Unnecssary? Why force_resolve if we resolve?
+                        dest_subdir = self._force_resolve(dirname)
                     src_subdir = source_directory.descend(dirname)
                     import_result = dest_subdir._partial_import_cas_into_cas(src_subdir, subcomponents,
                                                                              path_prefix=fullname, file_list_required=file_list_required)


[buildstream] 37/43: virtual_directory_test: PEP8

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 3cced840a613201a19ed272475c7b0fc6ffd84f6
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 11:05:19 2018 +0000

    virtual_directory_test: PEP8
---
 tests/storage/virtual_directory_import.py | 60 +++++++++++++++----------------
 1 file changed, 29 insertions(+), 31 deletions(-)

diff --git a/tests/storage/virtual_directory_import.py b/tests/storage/virtual_directory_import.py
index fcfcc37..b0a899c 100644
--- a/tests/storage/virtual_directory_import.py
+++ b/tests/storage/virtual_directory_import.py
@@ -12,10 +12,11 @@ from buildstream._artifactcache import ArtifactCache
 from buildstream._artifactcache.cascache import CASCache
 from buildstream import utils
 
+
 class FakeContext():
     def __init__(self):
         self.config_cache_quota = "65536"
-        
+
     def get_projects(self):
         return []
 
@@ -31,8 +32,8 @@ root_filesets = [
     [('a/b/d', 'D', '')],
     [('a/b/c', 'S', '/a/b/d')],
     [('a/b/d', 'S', '/a/b/c')],
-    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')], 
-    [('a/b/c', 'D', ''), ('a/b/d', 'S', '/a/b/c')], 
+    [('a/b/d', 'D', ''), ('a/b/c', 'S', '/a/b/d')],
+    [('a/b/c', 'D', ''), ('a/b/d', 'S', '/a/b/c')],
     [('a/b', 'F', 'This is textfile 1\n')],
     [('a/b/c', 'F', 'This is textfile 1\n')],
     [('a/b/c', 'D', '')]
@@ -61,7 +62,7 @@ def generate_import_roots(rootno, directory):
 
 
 def generate_random_root(rootno, directory):
-    random.seed(RANDOM_SEED+rootno)
+    random.seed(RANDOM_SEED + rootno)
     rootname = "root{}".format(rootno)
     rootdir = os.path.join(directory, "content", rootname)
     things = []
@@ -164,10 +165,9 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
     generator_function(original, tmpdir)
     if original != overlay:
         generator_function(overlay, tmpdir)
-        
+
     d = create_new_casdir(original, fake_context, tmpdir)
 
-    #duplicate_cas = CasBasedDirectory(fake_context, ref=copy.copy(d.ref))
     duplicate_cas = create_new_casdir(original, fake_context, tmpdir)
 
     assert duplicate_cas.ref.hash == d.ref.hash
@@ -179,7 +179,7 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
     roundtrip_dir = os.path.join(tmpdir, "roundtrip")
     d2.export_files(roundtrip_dir)
     d.export_files(export_dir)
-    
+
     if verify_contents:
         for item in root_filesets[overlay - 1]:
             (path, typename, content) = item
@@ -199,8 +199,10 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
                     assert os.path.islink(realpath)
                     assert os.readlink(realpath) == content
             elif typename == 'D':
-                # We can't do any more tests than this because it depends on things present in the original. Blank directories
-                # here will be ignored and the original left in place.
+                # We can't do any more tests than this because it
+                # depends on things present in the original. Blank
+                # directories here will be ignored and the original
+                # left in place.
                 assert os.path.lexists(realpath)
 
     # Now do the same thing with filebaseddirectories and check the contents match
@@ -215,20 +217,17 @@ def _import_test(tmpdir, original, overlay, generator_function, verify_contents=
 
     assert duplicate_cas.ref.hash == d.ref.hash
 
-    #d3 = create_new_casdir(original, fake_context, tmpdir)
-    #d4 = create_new_filedir(overlay, tmpdir)
-    #d3.import_files(d2)
-    #assert d.ref.hash == d3.ref.hash
 
-@pytest.mark.parametrize("original,overlay", combinations(range(1,len(root_filesets)+1)))
+@pytest.mark.parametrize("original,overlay", combinations(range(1, len(root_filesets) + 1)))
 def test_fixed_cas_import(cli, tmpdir, original, overlay):
     _import_test(tmpdir, original, overlay, generate_import_roots, verify_contents=True)
 
-@pytest.mark.parametrize("original,overlay", combinations(range(1,11)))
+
+@pytest.mark.parametrize("original,overlay", combinations(range(1, 11)))
 def test_random_cas_import_fast(cli, tmpdir, original, overlay):
     _import_test(tmpdir, original, overlay, generate_random_root, verify_contents=False)
 
-    
+
 def _listing_test(tmpdir, root, generator_function):
     fake_context = FakeContext()
     fake_context.artifactdir = tmpdir
@@ -248,38 +247,37 @@ def _listing_test(tmpdir, root, generator_function):
     print("filelist for root {} via CasBasedDirectory:".format(root))
     print("{}".format(filelist2))
     assert filelist == filelist2
-    
 
-@pytest.mark.parametrize("root", range(1,11))
+
+@pytest.mark.parametrize("root", range(1, 11))
 def test_random_directory_listing(cli, tmpdir, root):
     _listing_test(tmpdir, root, generate_random_root)
-    
+
+
 @pytest.mark.parametrize("root", [1, 2, 3, 4, 5])
 def test_fixed_directory_listing(cli, tmpdir, root):
     _listing_test(tmpdir, root, generate_import_roots)
 
 
-
-
 def main():
-    for i in range(1,6):
+    for i in range(1, 6):
         with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
             test_fixed_directory_listing(None, tmpdirname, i)
-            
-    for i in range(1,11):
+
+    for i in range(1, 11):
         with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
             test_random_directory_listing(None, tmpdirname, i)
 
-    for i in range(1,21):
-        for j in range(1,21):
+    for i in range(1, 21):
+        for j in range(1, 21):
             with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
                 test_random_cas_import_fast(None, tmpdirname, i, j)
 
-    for i in range(1,len(root_filesets)+1):
-        for j in range(1,len(root_filesets)+1):
+    for i in range(1, len(root_filesets) + 1):
+        for j in range(1, len(root_filesets) + 1):
             with tempfile.TemporaryDirectory(prefix="/home/jimmacarthur/.cache/buildstream/cas") as tmpdirname:
                 test_fixed_cas_import(None, tmpdirname, i, j)
-                
-                
-if __name__=="__main__":
+
+
+if __name__ == "__main__":
     main()
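
The import tests are parametrized over pairs of root numbers, and main() walks the same pairs with nested loops. The combinations helper used in the decorators is not shown in this diff. Assuming it yields every ordered (original, overlay) pair, as the nested loops do, a stand-in could be sketched as below; ordered_pairs is a hypothetical name.

```python
import itertools


def ordered_pairs(numbers):
    """Return every (original, overlay) pair, including pairs like (n, n)."""
    return list(itertools.product(numbers, repeat=2))


pairs = ordered_pairs(range(1, 4))
```

A list built this way can be passed directly as the second argument of pytest.mark.parametrize("original,overlay", ...).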


[buildstream] 35/43: casbaseddirectory: Replace one instance of _force_resolve with descend

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 1a0669a21940ef93cf3d2c9b702cbbdec73b733c
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 10:00:26 2018 +0000

    casbaseddirectory: Replace one instance of _force_resolve with descend
---
 buildstream/storage/_casbaseddirectory.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index d9fa2da..092cc52 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -542,8 +542,7 @@ class CasBasedDirectory(Directory):
                         else:
                             dest_subdir = x
                     else:
-                        self.create_directory(dirname) # Unnecssary? Why force_resolve if we resolve?
-                        dest_subdir = self._force_resolve(dirname)
+                        dest_subdir = self.descend(dirname, create=True)
                     src_subdir = source_directory.descend(dirname)
                     import_result = dest_subdir._partial_import_cas_into_cas(src_subdir, subcomponents,
                                                                              path_prefix=fullname, file_list_required=file_list_required)
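
This commit folds a create-then-resolve pair into a single descend(dirname, create=True) call. The get-or-create behaviour that makes this possible can be sketched on a toy tree of dicts. This is only an illustration under stated assumptions: the real descend works on CAS objects and symlinks, and the function below is a hypothetical stand-in.

```python
def descend(directory, name, create=False):
    """Return the subdirectory called name, optionally creating it."""
    entry = directory.get(name)
    if entry is None:
        if not create:
            raise KeyError(name)
        # Create an empty subdirectory and register it in the parent.
        entry = directory[name] = {}
    if not isinstance(entry, dict):
        raise NotADirectoryError(name)
    return entry


root = {"a": {}, "f": "file contents"}
existing = descend(root, "a")
created = descend(root, "b", create=True)
```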


[buildstream] 33/43: CasBasedDirectory: Remove 6 functions and rename files_in_subdir -> _files_in_subdir

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 0f429c9bc2c9412e62aa053de7d9ed7d61824957
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 09:39:42 2018 +0000

    CasBasedDirectory: Remove 6 functions and rename files_in_subdir -> _files_in_subdir
---
 buildstream/storage/_casbaseddirectory.py | 100 +-----------------------------
 1 file changed, 2 insertions(+), 98 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index a7d656d..16899e9 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -141,19 +141,6 @@ class CasBasedDirectory(Directory):
         # We don't need to do anything more than that; files were already added ealier, and symlinks are
         # part of the directory structure.
 
-    def _add_new_blank_directory(self, name) -> Directory:
-        bst_dir = CasBasedDirectory(self.context, parent=self, filename=name)
-        new_pb2_dirnode = self.pb2_directory.directories.add()
-        new_pb2_dirnode.name = name
-        # Calculate the hash for an empty directory
-        if name in self.index:
-            raise VirtualDirectoryError("Creating directory {} would overwrite an existing item in {}"
-                                        .format(name, str(self)))
-        new_pb2_directory = remote_execution_pb2.Directory()
-        self.cas_cache.add_object(digest=new_pb2_dirnode.digest, buffer=new_pb2_directory.SerializeToString())
-        self.index[name] = IndexEntry(new_pb2_dirnode, buildstream_object=bst_dir)
-        return bst_dir
-
     def create_directory(self, name: str) -> Directory:
         """Creates a directory if it does not already exist. This does not
         cause an error if something exists; it will remove files and
@@ -517,47 +504,7 @@ class CasBasedDirectory(Directory):
         return result
 
 
-    def _save(self, name):
-        """ Saves this directory into the content cache as a named ref. This function is not
-        currently in use, but may be useful later. """
-        self._recalculate_recursing_up()
-        self._recalculate_recursing_down()
-        (rel_refpath, refname) = os.path.split(name)
-        refdir = os.path.join(self.cas_directory, 'refs', 'heads', rel_refpath)
-        refname = os.path.join(refdir, refname)
-
-        if not os.path.exists(refdir):
-            os.makedirs(refdir)
-        with open(refname, "wb") as f:
-            f.write(self.ref.SerializeToString())
-
-    def find_updated_files(self, modified_directory, prefix=""):
-        """Find the list of written and overwritten files that would result
-        from importing 'modified_directory' into this one.  This does
-        not change either directory. The reason this exists is for
-        direct imports of cas directories into other ones, which can
-        be done by simply replacing a hash, but we still need the file
-        lists.
-
-        """
-        result = FileListResult()
-        for entry in modified_directory.pb2_directory.directories:
-            existing_dir = self._find_pb2_entry(entry.name)
-            if existing_dir:
-                updates_files = existing_dir.find_updated_files(modified_directory.descend(entry.name),
-                                                                os.path.join(prefix, entry.name))
-                result.combine(updated_files)
-            else:
-                for f in source_directory.descend(entry.name).list_relative_paths():
-                    result.files_written.append(os.path.join(prefix, f))
-                    # None of these can overwrite anything, since the original files don't exist
-        for entry in modified_directory.pb2_directory.files + modified_directory.pb2_directory.symlinks:
-            if self._find_pb2_entry(entry.name):
-                result.files_overwritten.apppend(os.path.join(prefix, entry.name))
-            result.file_written.apppend(os.path.join(prefix, entry.name))
-        return result
-
-    def files_in_subdir(sorted_files, dirname):
+    def _files_in_subdir(sorted_files, dirname):
         """Filters sorted_files and returns only the ones which have
            'dirname' as a prefix, with that prefix removed.
 
@@ -589,7 +536,7 @@ class CasBasedDirectory(Directory):
                 dirname = components[0]
                 if dirname not in processed_directories:
                     # Now strip off the first directory name and import files recursively.
-                    subcomponents = CasBasedDirectory.files_in_subdir(files, dirname)
+                    subcomponents = CasBasedDirectory._files_in_subdir(files, dirname)
                     # We will fail at this point if there is a file or symlink to file called 'dirname'.
                     if dirname in self.index:
                         x = self._resolve(dirname, force_create=True)
@@ -639,43 +586,6 @@ class CasBasedDirectory(Directory):
                         self._add_new_link_direct(name=f, target=item.target)
         return result
 
-    def transfer_node_contents(destination, source):
-        """Transfers all fields from the source PB2 node into the
-        destination. Destination and source must be of the same type and must
-        be a FileNode, SymlinkNode or DirectoryNode.
-        """
-        assert(type(destination) == type(source))
-        destination.name = source.name
-        if isinstance(destination, remote_execution_pb2.FileNode):
-            destination.digest.hash = source.digest.hash
-            destination.digest.size_bytes = source.digest.size_bytes
-            destination.is_executable = source.is_executable
-        elif isinstance(destination, remote_execution_pb2.SymlinkNode):
-            destination.target = source.target
-        elif isinstance(destination, remote_execution_pb2.DirectoryNode):
-            destination.digest.hash = source.digest.hash
-            destination.digest.size_bytes = source.digest.size_bytes
-        else:
-            raise VirtualDirectoryError("Incompatible type '{}' used as destination for transfer_node_contents"
-                                        .format(destination.type))
-
-    def _add_directory_from_node(self, source_node, source_casdir, can_hardlink=False):
-        # Duplicate the given node and add it to our index with a CasBasedDirectory object.
-        # No existing entry with the source node's name can exist.
-        # source_casdir is only needed if can_hardlink is True.
-        assert(self._find_pb2_entry(source_node.name) is None)
-
-        if can_hardlink:
-            new_dir_node = self.pb2_directory.directories.add()
-            CasBasedDirectory.transfer_node_contents(new_dir_node, source_node)
-            self.index[source_node.name] = IndexEntry(source_node, buildstream_object=source_casdir, modified=True)
-        else:
-            new_dir_node = self.pb2_directory.directories.add()
-            CasBasedDirectory.transfer_node_contents(new_dir_node, source_node)
-            buildStreamDirectory = CasBasedDirectory(self.context, ref=source_node.digest,
-                                                     parent=self, filename=source_node.name)
-            self.index[source_node.name] = IndexEntry(source_node, buildstream_object=buildStreamDirectory, modified=True)
-
     def _import_cas_into_cas(self, source_directory, files=None):
         """ A full import is significantly quicker than a partial import, because we can just
         replace one directory with another's hash, without doing any recursion.
@@ -919,12 +829,6 @@ class CasBasedDirectory(Directory):
                 filelist.append(k)
         return filelist
 
-    def _contains_only_directories(self):
-        for (k, v) in self.index.items():
-            if not isinstance(v.buildstream_object, CasBasedDirectory):
-                return False
-        return True
-
     def list_relative_paths(self, relpath=""):
         """Provide a list of all relative paths.
 


[buildstream] 41/43: _casbaseddirectory: Restructure resolve to make it a bit more logical

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit c9cfcf0291ed349797d27ec17d3824f90a8acd6c
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Tue Oct 30 12:19:23 2018 +0000

    _casbaseddirectory: Restructure resolve to make it a bit more logical
---
 buildstream/storage/_casbaseddirectory.py | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index a868c52..cd06649 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -365,26 +365,32 @@ class CasBasedDirectory(Directory):
 
                     if isinstance(f, CasBasedDirectory):
                         directory = f
-
+                    elif isinstance(f, remote_execution_pb2.FileNode):
+                        # F is a file
+                        if components:
+                            # We have components still to resolve, but one of the path components
+                            # is a file.
+                            if force_create:
+                                self.delete_entry(c)
+                                directory = directory.descend(c, create=True)
+                            else:
+                                return f # TODO: Why return f? We've got components left and hit a file; this should be an error.
+                                #raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
+                        else:
+                            # It's a file, but there's no components left, so just return that.
+                            return f
                     else:
-                        # This is a file or None (i.e. broken symlink)
-                        if f is None and force_create:
+                        # f is none, which covers many cases
+                        if force_create:
                             directory = directory.descend(c, create=True)
-                        elif components and force_create:
-                            # Oh dear. We have components left to resolve, but the one we're trying to resolve points to a file.
-                            print("Trying to resolve {}, but found {} was a file.".format(symlink.target, c))
-                            self.delete_entry(c)
-                            directory = directory.descend(c, create=True)
-                            #raise VirtualDirectoryError("Reached a file called {} while trying to resolve a symlink; cannot proceed".format(c))
                         else:
-                            return f
+                            return None
                 else:
                     if force_create:
                         directory = directory.descend(c, create=True)
                     else:
                         return None
-
-        # Shouldn't get here.
+        # You can only exit the while loop with a return, so you shouldn't be here.
         
 
     def _check_replacement(self, name, path_prefix, fileListResult):
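
Editorial sketch (not part of the commit): the branch structure introduced in the hunk above may be easier to follow outside the diff. The following is a minimal stand-alone model, assuming directories are plain nested dicts and files are plain strings. The name resolve_path and the dict/str representation are hypothetical and are not BuildStream APIs; symlink handling is left out entirely.

```python
def resolve_path(root, path, force_create=False):
    """Walk `path` through nested dicts (directories); str values stand in for files.

    Hypothetical model of the restructured branches; not the BuildStream code.
    """
    directory = root
    # Skip empty components and '.', as an earlier commit on this branch does.
    components = [c for c in path.split("/") if c and c != "."]
    while components:
        c = components.pop(0)
        entry = directory.get(c)
        if isinstance(entry, dict):
            # A directory: keep descending.
            directory = entry
        elif isinstance(entry, str):
            # A file.
            if components:
                if force_create:
                    # A file blocks the path: replace it with a new directory.
                    directory[c] = {}
                    directory = directory[c]
                else:
                    # Mirrors the commit's current behaviour; its TODO notes
                    # that this should arguably be an error instead.
                    return entry
            else:
                # A file with nothing left to resolve: return it.
                return entry
        else:
            # Nothing by that name.
            if force_create:
                directory[c] = {}
                directory = directory[c]
            else:
                return None
    # Unlike the loop in the commit, this model returns the final directory here.
    return directory


tree = {"a": {"b": "file-b"}}
found = resolve_path(tree, "a/b")
missing = resolve_path(tree, "a/x/y")
created = resolve_path(tree, "a/b/c", force_create=True)
```

With force_create=True the file "b" is replaced by a directory so that "c" can be created beneath it, which corresponds to the delete_entry(c) followed by descend(c, create=True) pair in the hunk.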


[buildstream] 12/43: Add a tool to show differences in two CAS directories

Posted by gi...@apache.org.

github-bot pushed a commit to branch jmac/cas_to_cas_oct_v2
in repository https://gitbox.apache.org/repos/asf/buildstream.git

commit 2a4710570a617a3c228085004e382f70f68c363e
Author: Jim MacArthur <ji...@codethink.co.uk>
AuthorDate: Fri Oct 19 18:28:45 2018 +0100

    Add a tool to show differences in two CAS directories
---
 buildstream/storage/_casbaseddirectory.py | 50 ++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/buildstream/storage/_casbaseddirectory.py b/buildstream/storage/_casbaseddirectory.py
index 69a3608..f624e34 100644
--- a/buildstream/storage/_casbaseddirectory.py
+++ b/buildstream/storage/_casbaseddirectory.py
@@ -349,8 +349,13 @@ class CasBasedDirectory(Directory):
 
     
     def _resolve(self, name, absolute_symlinks_resolve=True):
-        """ Resolves any name to an object. If the name points to a symlink in this 
-        directory, it returns the thing it points to, recursively. Returns a CasBasedDirectory, FileNode or None. Never creates a directory or otherwise alters the directory. """
+        """ Resolves any name to an object. If the name points to a symlink in
+        this directory, it returns the thing it points to,
+        recursively. Returns a CasBasedDirectory, FileNode or
+        None. Never creates a directory or otherwise alters the
+        directory.
+
+        """
         # First check if it's a normal object and return that
 
         if name not in self.index:
@@ -409,7 +414,7 @@ class CasBasedDirectory(Directory):
                         else:
                             return f
                 else:
-                    print("  resolving {}: nonexistent!".format(c))
+                    print("  resolving {}: Broken symlink".format(c))
                     return None
 
         # Shouldn't get here.
@@ -637,7 +642,43 @@ class CasBasedDirectory(Directory):
             print("Extracted all files from source directory '{}': {}".format(source_directory, files))
         return self._partial_import_cas_into_cas(source_directory, files)
 
-
+    def showdiff(self, other):
+        print("Diffing {} and {}:".format(self, other))
+        l1 = list(self.index.items())
+        l2 = list(other.index.items())
+        for (key, value) in l1:
+            if len(l2) == 0:
+                print("'Other' is short: no item to correspond to '{}' in first.".format(key))
+                return
+            (key2, value2) = l2.pop(0)
+            if key != key2:
+                print("Mismatch: item named {} in first, named {} in second".format(key, key2))
+                return
+            if type(value.pb_object) != type(value2.pb_object):
+                print("Mismatch: item named {}'s pb_object is a {} in first and a {} in second".format(key, type(value.pb_object), type(value2.pb_object)))
+                return
+            if type(value.buildstream_object) != type(value2.buildstream_object):
+                print("Mismatch: item named {}'s buildstream_object is a {} in first and a {} in second".format(key, type(value.buildstream_object), type(value2.buildstream_object)))
+                return
+            print("Inspecting {} of type {}".format(key, type(value.pb_object)))
+            if type(value.pb_object) == remote_execution_pb2.DirectoryNode:
+                # It's a directory, follow it
+                self.descend(key).showdiff(other.descend(key))
+            elif type(value.pb_object) == remote_execution_pb2.SymlinkNode:
+                target1 = value.pb_object.target
+                target2 = value2.pb_object.target
+                if target1 != target2:
+                    print("Symlink named {}: targets do not match. {} in the first, {} in the second".format(key, target1, target2))
+            elif type(value.pb_object) == remote_execution_pb2.FileNode:
+                if value.pb_object.digest != value2.pb_object.digest:
+                    print("File named {}: digests do not match. {} in the first, {} in the second".format(key, value.pb_object.digest, value2.pb_object.digest))
+        if len(l2) != 0:
+            print("'Other' is long: it contains extra items called: {}".format(", ".join([i[0] for i in l2])))
+            return
+        print("No differences found in {}".format(self))
+              
+        
+    
     def import_files(self, external_pathspec, *, files=None,
                      report_written=True, update_utimes=False,
                      can_link=False):
@@ -698,6 +739,7 @@ class CasBasedDirectory(Directory):
             self.parent._recalculate_recursing_up(self)
         if duplicate_cas:
             if duplicate_cas.ref.hash != self.ref.hash:
+                self.showdiff(duplicate_cas)
                 raise VirtualDirectoryError("Mismatch between file-imported result {} and cas-to-cas imported result {}.".format(duplicate_cas.ref.hash,self.ref.hash))
 
         return result
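
Editorial sketch (not part of the commit): the recursive comparison that showdiff performs can be illustrated on plain data. This version assumes directories are nested dicts and files are strings standing in for digests; diff_trees and that representation are hypothetical, not BuildStream APIs. It also differs from showdiff in two deliberate ways: it matches entries by name rather than by position in the index, and it collects every difference in a list instead of printing and returning at the first mismatch.

```python
def diff_trees(first, second, prefix=""):
    """Return a list of human-readable differences between two nested dicts.

    Hypothetical model: dict values are directories, str values are file digests.
    """
    differences = []
    for name in sorted(set(first) | set(second)):
        path = prefix + name
        if name not in second:
            differences.append("{}: only in first".format(path))
        elif name not in first:
            differences.append("{}: only in second".format(path))
        else:
            a, b = first[name], second[name]
            if type(a) != type(b):
                # Same name, different kind of entry (e.g. file vs directory).
                differences.append("{}: {} in first, {} in second".format(
                    path, type(a).__name__, type(b).__name__))
            elif isinstance(a, dict):
                # Both are directories: recurse, as showdiff does via descend().
                differences.extend(diff_trees(a, b, path + "/"))
            elif a != b:
                differences.append("{}: contents differ".format(path))
    return differences


first = {"bin": {"sh": "digest1"}, "etc": "digest2"}
second = {"bin": {"sh": "digest9"}, "lib": {}}
report = diff_trees(first, second)
```

Matching by name means a single missing entry does not hide later differences at the same level, at the cost of no longer detecting a pure ordering mismatch between the two indexes, which the positional comparison in showdiff would report.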