Posted to commits@allura.apache.org by jo...@apache.org on 2013/02/12 23:21:49 UTC

[8/11] git commit: [#5685] Converted commit log and last_commit_ids for git back to use SCM library

[#5685] Converted commit log and last_commit_ids for git back to use SCM library

After some more empirical testing, it appears that my earlier
determination that calling out to git to determine the LCD info was too
expensive was an artifact of the previous implementation of commit log.
That implementation did not use a generator, so it walked the entire
history on every call instead of stopping once the changed paths were
identified.
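
The difference between the two approaches can be illustrated with a
minimal sketch. This is not Allura code: HISTORY, walk_history and the
visited counter are hypothetical stand-ins, assuming a linear history
in which each commit records the paths it changed. The eager variant
materializes the whole walk before searching it; the lazy variant
consumes a generator and stops at the first match.

```python
# Hypothetical stand-in for a repository history, newest commit first.
# Each entry is (commit_id, changed_paths).
HISTORY = [("c%03d" % i, {"file%d.txt" % (i % 50)}) for i in range(1000)]

# Counts how many commits each variant actually visits.
visited = {"eager": 0, "lazy": 0}


def walk_history(label):
    """Yield commits one at a time, counting each commit that is visited."""
    for commit_id, changed in HISTORY:
        visited[label] += 1
        yield commit_id, changed


def last_change_eager(path):
    # Builds the full list first, so every commit is visited
    # even when the match is near the start of the history.
    commits = list(walk_history("eager"))
    for commit_id, changed in commits:
        if path in changed:
            return commit_id
    return None


def last_change_lazy(path):
    # Consumes the generator directly and returns at the first match,
    # so the rest of the history is never walked.
    for commit_id, changed in walk_history("lazy"):
        if path in changed:
            return commit_id
    return None


eager_result = last_change_eager("file7.txt")
lazy_result = last_change_lazy("file7.txt")
```

Both variants return the same commit id; only the number of commits
visited differs, which is the effect the timings below reflect.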

TimerMiddleware metrics for HEAD~ implementation:

    Total Time (ms)    Calls        Method
       48144              1    repo.LastCommit._build

TimerMiddleware metrics for this implementation:

    Total Time (ms)    Calls        Method
        1646              1    repo.LastCommit._build

This really needs to be tested against many repeated calls to
LastCommit._build, since the overhead of repeated calls to the external
git process might add up faster than the (significant) savings.

Signed-off-by: Cory Johns <jo...@geek.net>


Project: http://git-wip-us.apache.org/repos/asf/incubator-allura/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-allura/commit/d2f16cf5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-allura/tree/d2f16cf5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-allura/diff/d2f16cf5

Branch: refs/heads/cj/5685
Commit: d2f16cf5baab247c6be5a71a3b28937d0a68ef2b
Parents: a981317
Author: Cory Johns <jo...@geek.net>
Authored: Tue Feb 12 03:37:07 2013 +0000
Committer: Cory Johns <jo...@geek.net>
Committed: Tue Feb 12 22:21:20 2013 +0000

----------------------------------------------------------------------
 ForgeGit/forgegit/model/git_repo.py |   46 ++++++++++++++---------------
 1 files changed, 22 insertions(+), 24 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-allura/blob/d2f16cf5/ForgeGit/forgegit/model/git_repo.py
----------------------------------------------------------------------
diff --git a/ForgeGit/forgegit/model/git_repo.py b/ForgeGit/forgegit/model/git_repo.py
index 67a1906..0bcd4c2 100644
--- a/ForgeGit/forgegit/model/git_repo.py
+++ b/ForgeGit/forgegit/model/git_repo.py
@@ -239,35 +239,33 @@ class GitImplementation(M.RepositoryImplementation):
         return doc
 
     def commits(self, path=None, rev=None, skip=None, limit=None):
-        if rev is None:
-            rev = 'HEAD'
-        start = skip or 0
-        stop = start + limit if limit is not None else None
-
-        def _pred(c):
-            '''
-            Work-around for potentially b0rked changed_paths.
-            This could be replaced with lambda c: path in c.changed_paths
-            once all projects have had their DiffInfoDocs refreshed.'''
-            if path in c.changed_paths:
-                return True
-            parent = c.get_parent()
-            if c.has_path(path) and not (parent and parent.has_path(path)):
-                return True  # added in this commit, inspite of changed_paths
-            return False
-        predicate = None
-        if path is not None:
-            path = path.strip('/')
-            predicate = _pred
-
-        iter_tree = self.commit(rev).climb_commit_tree(predicate)
-        for commit in itertools.islice(iter_tree, start, stop):
-            yield commit._id
+        params = dict(paths=path)
+        if rev is not None:
+            params['rev'] = rev
+        if skip is not None:
+            params['skip'] = skip
+        if limit is not None:
+            params['max_count'] = limit
+        return (c.hexsha for c in self._git.iter_commits(**params))
 
     def commits_count(self, path=None, rev=None):
         commit = self._git.commit(rev)
         return commit.count(path)
 
+    def last_commit_ids(self, commit, paths):
+        cache = getattr(c, 'model_cache', '') or M.repo.ModelCache()
+        tree_path = os.path.commonprefix(paths).strip('/')
+        paths = set(paths)
+        result = {}
+        for commit_id in self.commits(path=tree_path, rev=commit._id):
+            commit = cache.get(M.repo.Commit, dict(_id=commit_id))
+            changed = paths & set(commit.changed_paths)
+            result.update({path: commit_id for path in changed})
+            paths = paths - changed
+            if not paths:
+                break
+        return result
+
     def log(self, object_id, skip, count):
         obj = self._git.commit(object_id)
         candidates = [ obj ]
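
The path-resolution loop added in last_commit_ids can be tried in
isolation. The sketch below is a simplified stand-in, not the Allura
implementation: COMMITS replaces both the commits() generator and the
ModelCache lookup with an in-memory list, the prefix filter only
approximates restricting the log to tree_path, and the name
resolve_last_commits is hypothetical.

```python
import os.path

# Hypothetical history, newest first: (commit_id, changed_paths).
COMMITS = [
    ("c3", ["docs/readme.txt"]),
    ("c2", ["docs/guide.txt", "src/main.py"]),
    ("c1", ["docs/readme.txt", "docs/guide.txt", "docs/faq.txt"]),
    ("c0", ["docs/old.txt"]),
]


def resolve_last_commits(paths):
    """Map each requested path to the newest commit that changed it."""
    # commonprefix compares strings character by character, so the result
    # is not guaranteed to end on a path-component boundary.
    tree_path = os.path.commonprefix(paths).strip("/")
    remaining = set(paths)
    result = {}
    visited = 0
    for commit_id, changed_paths in COMMITS:
        # Stand-in for limiting the log to commits touching tree_path.
        if not any(p.startswith(tree_path) for p in changed_paths):
            continue
        visited += 1
        changed = remaining & set(changed_paths)
        result.update({path: commit_id for path in changed})
        remaining -= changed
        if not remaining:
            break  # every requested path is resolved; stop walking
    return result, visited


result, visited = resolve_last_commits(
    ["docs/readme.txt", "docs/guide.txt", "docs/faq.txt"])
```

Because resolved paths are removed from the working set, an older commit
that touches an already-resolved path does not overwrite the newer
answer, and the walk ends before the oldest commit is reached.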