Posted to commits@allura.apache.org by jo...@apache.org on 2013/02/12 04:45:25 UTC

git commit: [#5685] Converted commit log and last_commit_ids for git back to use SCM library

Updated Branches:
  refs/heads/cj/5685 fe0233a61 -> 4473bded2


[#5685] Converted commit log and last_commit_ids for git back to use SCM library

After some more empirical testing, it appears that my earlier conclusion
(that calling out to git to determine the LCD info was too expensive) was
wrong. The cost came from the previous implementation of commit log not
using a generator, so it walked the entire history on every call instead
of stopping once the changed paths were identified.
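
To illustrate the difference, here is a minimal, self-contained sketch.
It is not the Allura code: HISTORY, walk_history and the standalone
last_commit_ids below are hypothetical stand-ins, and the history is a
hard-coded list rather than a real repository. It only shows that a
lazily consumed generator stops visiting commits once every requested
path has been matched, while a fully materialized log visits all of them.

```python
def walk_history(history, visited):
    """Lazily yield (commit_id, changed_paths), recording each commit visited."""
    for commit_id, changed_paths in history:
        visited.append(commit_id)
        yield commit_id, changed_paths


def last_commit_ids(history_iter, paths):
    """Map each path to the newest commit that changed it, stopping early."""
    remaining = set(paths)
    result = {}
    for commit_id, changed_paths in history_iter:
        changed = remaining & set(changed_paths)
        result.update({path: commit_id for path in changed})
        remaining -= changed
        if not remaining:
            break  # every path is accounted for; stop pulling commits
    return result


# Hypothetical history, newest commit first (an assumption for this sketch).
HISTORY = [
    ("c5", ["README"]),
    ("c4", ["src/a.py"]),
    ("c3", ["src/b.py"]),
    ("c2", ["docs/x"]),
    ("c1", ["README", "src/a.py"]),
]
WANTED = ["README", "src/a.py"]

# Lazy: the generator is only advanced as far as the loop needs.
lazy_visited = []
lazy_result = last_commit_ids(walk_history(HISTORY, lazy_visited), WANTED)

# Eager: list() forces the whole history to be walked before any matching.
eager_visited = []
eager_result = last_commit_ids(list(walk_history(HISTORY, eager_visited)), WANTED)

print(len(lazy_visited), len(eager_visited))
```

Both variants return the same mapping; only the number of commits
visited differs, which is where the saving would come from.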

TimerMiddleware metrics for HEAD~ implementation:

    Total Time (ms)    Calls        Method
       48144              1    repo.LastCommit._build

TimerMiddleware metrics for this implementation:

    Total Time (ms)    Calls        Method
        1646              1    repo.LastCommit._build

This really needs to be tested against many repeated calls to
LastCommit._build, since the overhead of repeated calls to the external
git process might add up faster than the (significant) savings.
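
One rough way to check this is to time many repeated calls and look at
the per-call mean. The sketch below is an assumption-laden illustration,
not TimerMiddleware: time_repeated and fake_build are hypothetical names,
and fake_build is a cheap stand-in that would need to be replaced with a
real call to LastCommit._build against a test repository.

```python
import time


def time_repeated(func, repeats):
    """Call func() `repeats` times; return (total_ms, mean_ms_per_call)."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    total_ms = (time.perf_counter() - start) * 1000.0
    return total_ms, total_ms / repeats


calls = []


def fake_build():
    """Hypothetical stand-in for the operation being measured."""
    calls.append(1)
    sum(range(1000))


total_ms, mean_ms = time_repeated(fake_build, 50)
print("total %.3f ms over %d calls, mean %.3f ms" % (total_ms, len(calls), mean_ms))
```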

Signed-off-by: Cory Johns <jo...@geek.net>


Project: http://git-wip-us.apache.org/repos/asf/incubator-allura/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-allura/commit/4473bded
Tree: http://git-wip-us.apache.org/repos/asf/incubator-allura/tree/4473bded
Diff: http://git-wip-us.apache.org/repos/asf/incubator-allura/diff/4473bded

Branch: refs/heads/cj/5685
Commit: 4473bded20ddff374c32255f786ee4ef160fd1f0
Parents: fe0233a
Author: Cory Johns <jo...@geek.net>
Authored: Tue Feb 12 03:37:07 2013 +0000
Committer: Cory Johns <jo...@geek.net>
Committed: Tue Feb 12 03:37:10 2013 +0000

----------------------------------------------------------------------
 ForgeGit/forgegit/model/git_repo.py |   46 ++++++++++++++---------------
 1 files changed, 22 insertions(+), 24 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-allura/blob/4473bded/ForgeGit/forgegit/model/git_repo.py
----------------------------------------------------------------------
diff --git a/ForgeGit/forgegit/model/git_repo.py b/ForgeGit/forgegit/model/git_repo.py
index 67a1906..0bcd4c2 100644
--- a/ForgeGit/forgegit/model/git_repo.py
+++ b/ForgeGit/forgegit/model/git_repo.py
@@ -239,35 +239,33 @@ class GitImplementation(M.RepositoryImplementation):
         return doc
 
     def commits(self, path=None, rev=None, skip=None, limit=None):
-        if rev is None:
-            rev = 'HEAD'
-        start = skip or 0
-        stop = start + limit if limit is not None else None
-
-        def _pred(c):
-            '''
-            Work-around for potentially b0rked changed_paths.
-            This could be replaced with lambda c: path in c.changed_paths
-            once all projects have had their DiffInfoDocs refreshed.'''
-            if path in c.changed_paths:
-                return True
-            parent = c.get_parent()
-            if c.has_path(path) and not (parent and parent.has_path(path)):
-                return True  # added in this commit, inspite of changed_paths
-            return False
-        predicate = None
-        if path is not None:
-            path = path.strip('/')
-            predicate = _pred
-
-        iter_tree = self.commit(rev).climb_commit_tree(predicate)
-        for commit in itertools.islice(iter_tree, start, stop):
-            yield commit._id
+        params = dict(paths=path)
+        if rev is not None:
+            params['rev'] = rev
+        if skip is not None:
+            params['skip'] = skip
+        if limit is not None:
+            params['max_count'] = limit
+        return (c.hexsha for c in self._git.iter_commits(**params))
 
     def commits_count(self, path=None, rev=None):
         commit = self._git.commit(rev)
         return commit.count(path)
 
+    def last_commit_ids(self, commit, paths):
+        cache = getattr(c, 'model_cache', '') or M.repo.ModelCache()
+        tree_path = os.path.commonprefix(paths).strip('/')
+        paths = set(paths)
+        result = {}
+        for commit_id in self.commits(path=tree_path, rev=commit._id):
+            commit = cache.get(M.repo.Commit, dict(_id=commit_id))
+            changed = paths & set(commit.changed_paths)
+            result.update({path: commit_id for path in changed})
+            paths = paths - changed
+            if not paths:
+                break
+        return result
+
     def log(self, object_id, skip, count):
         obj = self._git.commit(object_id)
         candidates = [ obj ]