nodegit: fileHistoryWalk does not return all commits
When I compare the history returned by fileHistoryWalk to the console output of git log, a lot of commits are missing from the fileHistoryWalk result.
For example, using the atom editor repo and source:
cd /tmp
git clone https://github.com/atom/atom.git
cd atom
git log src/text-editor.coffee
…then run a slightly modified examples/walk-history-for-file.js (see below) and compare. The example reports 129 commits to text-editor.coffee, but git log src/text-editor.coffee shows 481 commits to that file.
Modified example (changed the repo and the test file path, and added count output):
var nodegit = require("../"),
  path = require("path"),
  historyFile = "src/text-editor.coffee",
  walker,
  historyCommits = [],
  commit,
  repo;

// This code walks the history of the master branch and prints results
// that look very similar to calling `git log` from the command line

function compileHistory(resultingArrayOfCommits) {
  var lastSha;
  if (historyCommits.length > 0) {
    lastSha = historyCommits[historyCommits.length - 1].commit.sha();
    // Stop once a batch contains nothing but the commit we started from.
    if (
      resultingArrayOfCommits.length == 1 &&
      resultingArrayOfCommits[0].commit.sha() == lastSha
    ) {
      return;
    }
  }

  resultingArrayOfCommits.forEach(function(entry) {
    historyCommits.push(entry);
  });

  // Start the next walk from the last commit collected so far.
  lastSha = historyCommits[historyCommits.length - 1].commit.sha();

  walker = repo.createRevWalk();
  walker.push(lastSha);
  walker.sorting(nodegit.Revwalk.SORT.TIME);

  return walker.fileHistoryWalk(historyFile, 500)
    .then(compileHistory);
}

//nodegit.Repository.open(path.resolve(__dirname, "../.git"))
nodegit.Repository.open("/tmp/atom/.git")
  .then(function(r) {
    repo = r;
    return repo.getMasterCommit();
  })
  .then(function(firstCommitOnMaster){
    // History returns an event.
    walker = repo.createRevWalk();
    walker.push(firstCommitOnMaster.sha());
    // The enum key is TIME; SORT.Time is undefined.
    walker.sorting(nodegit.Revwalk.SORT.TIME);

    return walker.fileHistoryWalk(historyFile, 500);
  })
  .then(compileHistory)
  .then(function() {
    historyCommits.forEach(function(entry) {
      commit = entry.commit;
      console.log("commit " + commit.sha());
      console.log("Author:", commit.author().name() +
        " <" + commit.author().email() + ">");
      console.log("Date:", commit.date());
      console.log("\n    " + commit.message());
    });
    console.log("\n\n" + historyCommits.length + " total commits");
  })
  .done();
About this issue
- State: closed
- Created 8 years ago
- Reactions: 4
- Comments: 15 (6 by maintainers)
fileHistoryWalk(fileName, 500) seems to behave like just a regular getCommits(500) that filters by commits that have the fileName, instead of returning (up to) 500 entries that involve the fileName.
revwalk.fileHistoryWalk(fileName, 1), instead of returning the nearest commit that affects fileName, will always result in [] unless the file was modified in the commit that was pushed in the revwalk.
This makes fileHistoryWalk fairly useless IMO 😦
The inability to interrupt the walk in revWalk#walk and/or Commit#history()#start also makes it very hard to only look at the most recent commits of a specific file in very large repositories, but that is a different issue.

I'm puzzled how difficult it is to do git log -- path/to/file with this library.

As mentioned, fileHistoryWalk is completely useless, as it doesn't seem to go through all commits in the history, no matter how large a number you give it.
nodegit@0.27.0
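
The difference described in the comments above, between filtering the first N walked commits and finding the first N commits that touch the file, can be shown with a toy model. This is only a sketch in plain JavaScript over a made-up history; it does not call nodegit, and the names filterFirstN, firstNMatching and touchesFile are hypothetical.

```javascript
// Toy history, newest first. `touchesFile` marks commits that changed
// the file we care about. All names and data here are made up.
const history = [
  { sha: "c6", touchesFile: false },
  { sha: "c5", touchesFile: false },
  { sha: "c4", touchesFile: true },
  { sha: "c3", touchesFile: false },
  { sha: "c2", touchesFile: true },
  { sha: "c1", touchesFile: true },
];

// Behaviour as reported above: walk at most `n` commits, then keep
// only the ones that touch the file.
function filterFirstN(commits, n) {
  return commits.slice(0, n).filter((c) => c.touchesFile);
}

// Behaviour the commenters expected: keep walking until `n` matching
// commits are found or the history runs out.
function firstNMatching(commits, n) {
  const found = [];
  for (const c of commits) {
    if (found.length >= n) break;
    if (c.touchesFile) found.push(c);
  }
  return found;
}

console.log(filterFirstN(history, 1).map((c) => c.sha));   // []
console.log(firstNMatching(history, 1).map((c) => c.sha)); // [ 'c4' ]
console.log(filterFirstN(history, 3).map((c) => c.sha));   // [ 'c4' ]
console.log(firstNMatching(history, 3).map((c) => c.sha)); // [ 'c4', 'c2', 'c1' ]
```

With a count of 1 the first function returns nothing because the newest commit did not touch the file, which matches the [] result reported for fileHistoryWalk(fileName, 1).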

Also, for the record: fileHistoryWalk still does not return all commits; the array it returns should now have a reachedEndOfHistory key. The intent is that you can request 30k to 50k commits at a time until you hit the end of history (the init commit). The libgit2 revwalk seems to be a little slower than git core: the revwalk alone on the Linux repository takes considerably longer than it does with standard git core. Large applications should be designed with that slowdown in mind.
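
The batch-until-end idea could look roughly like the sketch below. The helper collectFileHistory and its nextBatch callback are hypothetical names, not part of nodegit; the only thing taken from the comment above is that each returned array has a reachedEndOfHistory key. The nodegit wiring in the trailing comment is untested and assumes that calling fileHistoryWalk again on the same walker resumes where the previous call stopped.

```javascript
// Collect file-history entries batch by batch until the end of history.
// `nextBatch` is a hypothetical callback that resolves to an array
// carrying a `reachedEndOfHistory` flag, as described above.
// `maxBatches` is a safety cap so a missing flag cannot loop forever.
async function collectFileHistory(nextBatch, maxBatches = 1000) {
  const entries = [];
  for (let i = 0; i < maxBatches; i++) {
    const batch = await nextBatch();
    // A batch may be empty if none of the walked commits touched the file.
    for (const entry of batch) {
      entries.push(entry);
    }
    if (batch.reachedEndOfHistory) {
      return entries;
    }
  }
  throw new Error("gave up after " + maxBatches + " batches");
}

// Possible wiring with nodegit (untested; assumes that a second call to
// fileHistoryWalk on the same walker continues from where it stopped):
//
//   const walker = repo.createRevWalk();
//   walker.push(firstCommitOnMaster.sha());
//   walker.sorting(nodegit.Revwalk.SORT.TIME);
//   collectFileHistory(() => walker.fileHistoryWalk(historyFile, 30000))
//     .then((entries) => console.log(entries.length + " total commits"));
```

Passing the batch source in as a callback keeps the loop independent of nodegit, so it can be exercised with fake batches before pointing it at a large repository.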