gitlogg: Gitlogg is broken for some repositories

gitlogg-parse-json.js is broken.

sh gitlogg-generate-log.sh                   
-e Generating git log for all repositories located at '../travistorrent-tools'. This might take a while!
-e The file ./gitlogg.tmp generated in: 0s
Generating JSON output...
[stdin]:40
  author_name = item[16].replace(/"/g, "'"),
                        ^

TypeError: Cannot read property 'replace' of undefined
    at [stdin]:40:25
    at Array.reduce (native)
    at [stdin]:23:4
    at Object.exports.runInThisContext (vm.js:54:17)
    at Object.<anonymous> ([stdin]-wrapper:6:22)
    at Module._compile (module.js:409:26)
    at node.js:579:27
    at nextTickCallbackWith0Args (node.js:420:9)
    at process._tickCallback (node.js:349:13)

Minimally demonstrative example: Analyse repository TestRoots/travistorrent-tools.

Thanks for looking into this 😃, I liked the way you prepare the git log output in gitlogg-generate-log.sh. Should this problem be related to git log behaving strangely (printing/not prinitng empty lines), I might have a clean fix with awk.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Comments: 21 (19 by maintainers)

Most upvoted comments

After breaking down the git log and parsing step by step, I discovered that there was actually a carriage return (^M) character in one of the Git commit messages being parsed, which was adding an extra newline and throwing all subsequent parsing off as a result.

Modifying the parsing in gitlogg-generate-log.sh to include a carriage return deletion allows me to get past the issue.

        git log --all --no-merges --shortstat --reverse --pretty=format:'commits\trepository\t'"${PWD##*/}"'\tcommit_hash\t%H\tcommit_hash_abbreviated\t%h\ttree_hash\t%T\ttree_hash_abbreviated\t%t\tparent_hashes\t%P\tparent_hashes_abbreviated\t%p\tauthor_name\t%an\tauthor_name_mailmap\t%aN\tauthor_email\t%ae\tauthor_email_mailmap\t%aE\tauthor_date\t%ad\tauthor_date_RFC2822\t%aD\tauthor_date_relative\t%ar\tauthor_date_unix_timestamp\t%at\tauthor_date_iso_8601\t%ai\tauthor_date_iso_8601_strict\t%aI\tcommitter_name\t%cn\tcommitter_name_mailmap\t%cN\tcommitter_email\t%ce\tcommitter_email_mailmap\t%cE\tcommitter_date\t%cd\tcommitter_date_RFC2822\t%cD\tcommitter_date_relative\t%cr\tcommitter_date_unix_timestamp\t%ct\tcommitter_date_iso_8601\t%ci\tcommitter_date_iso_8601_strict\t%cI\tref_names\t%d\tref_names_no_wrapping\t%D\tencoding\t%e\tsubject\t%s\tsubject_sanitized\t%f\tcommit_notes\t%N\tstats\t' |
          sed '/^[ \t]*$/d' |               # remove all newlines/line-breaks, including those with empty spaces
+          tr -d '\r' |                      # Delete carriage returns
          tr '\n' 'ò' |                     # convert newlines/line-breaks to a character, so we can manipulate it without much trouble
          tr '\r' 'ò' |                     # convert carriage returns to a character, so we can manipulate it without much trouble
          sed 's/tòcommits/tòòcommits/g' |  # because some commits have no stats, we have to create an extra line-break to make `paste -d ' ' - -` consistent
          tr 'ò' '\n' |                     # bring back all line-breaks
          sed '{
              N
              s/[)]\n\ncommits/)\
          commits/g
          }' |                              # some rogue mystical line-breaks need to go down to their knees and beg for mercy, which they're not getting
          paste -d ' ' - -                  # collapse lines so that the `shortstat` is merged with the rest of the commit data, on a single line