GitLogParser

Generates visualisation data from a git repository log and the repository's file list.

This parser specializes in tracking file changes across the different versions of a file on feature branches and on the merged main branch. It also takes special care not to display non-existent files that can show up in ordinary Git histories because of renames on feature branches.

Things to note:

  • File deletions that get reverted later on are ignored by this parser.

It supports the following metrics per file:

Metric | Description
age_in_weeks | The file’s age measured in weeks since creation.
number_of_authors | The count of distinct authors who have contributed commits.
number_of_commits | The total commits made to the file.
number_of_renames | How many times the file has been renamed.
range_of_weeks_with_commits | The span of weeks during which commits were made.
successive_weeks_with_commits | Consecutive weeks during which the file received commits.
weeks_with_commits | The number of weeks in which the file was modified.
highly_coupled_files | Files often modified together with this file (35% overlap or more).
median_coupled_files | The median number of files modified in tandem with this file.
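
To make the week-based metrics more concrete, the following Python sketch derives three of them from a list of commit dates. The definitions used here are assumptions for illustration only: weeks start on Monday, the range is counted inclusively from the first to the last week with a commit, and the successive-weeks value is the longest unbroken run. The parser's actual bucketing may differ, and the names `week_index` and `week_metrics` are hypothetical.

```python
from datetime import date


def week_index(d: date) -> int:
    """Running week number; weeks start on Monday (0001-01-01 was a Monday)."""
    return (d.toordinal() - 1) // 7


def week_metrics(commit_dates):
    """Derive week-based metrics from an iterable of commit dates (assumed definitions)."""
    weeks = sorted({week_index(d) for d in commit_dates})
    if not weeks:
        return {
            "weeks_with_commits": 0,
            "range_of_weeks_with_commits": 0,
            "successive_weeks_with_commits": 0,
        }
    longest = run = 1
    for prev, cur in zip(weeks, weeks[1:]):
        # Extend the current run if the weeks are adjacent, otherwise restart it.
        run = run + 1 if cur == prev + 1 else 1
        longest = max(longest, run)
    return {
        "weeks_with_commits": len(weeks),
        "range_of_weeks_with_commits": weeks[-1] - weeks[0] + 1,
        "successive_weeks_with_commits": longest,
    }
```

For commits on 2024-01-01, 2024-01-03, 2024-01-08 and 2024-01-29, this sketch reports 3 weeks with commits, a range of 5 weeks, and 2 successive weeks.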

Additionally, the following Edge Metrics are calculated:

Metric | Description
temporal_coupling | The degree of temporal coupling between two files (>=35%)
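
As a rough illustration of what a temporal coupling edge could look like, the Python sketch below counts how often two files appear in the same commit. The coupling formula is an assumption: it divides the number of shared commits by the number of commits touching either file, which is not necessarily how the parser computes the value. Only the 35% threshold is taken from the table above; the function name `temporal_coupling` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def temporal_coupling(commits, threshold=0.35):
    """Return {(file_a, file_b): degree} for pairs at or above the threshold.

    `commits` is an iterable of file-path lists, one list per commit.
    The degree is shared commits / commits touching either file (assumed formula).
    """
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))  # ignore duplicates, keep pair order stable
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    edges = {}
    for (a, b), shared in pair_counts.items():
        either = file_counts[a] + file_counts[b] - shared
        degree = shared / either
        if degree >= threshold:
            edges[(a, b)] = degree
    return edges
```

Sorting the file names within each commit keeps every pair in a single canonical order, so `("a", "b")` and `("b", "a")` are never counted separately.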

The names of authors are saved when the --add-author flag is set.

Scan a local git repository on your machine

Creating required files on the fly | repo-scan

See ccsh gitlogparser repo-scan -h for help. Standard usage:

ccsh gitlogparser repo-scan --repo-path <path>

With the repo-scan subcommand you can parse a local git repository on your disk. During the scan, a git log of the repository in the current working directory (or in the directory given by --repo-path) is written to your temp folder and parsed automatically. The parser also automatically creates a temporary list of the files tracked by git, which the parsing process needs.

The result is written as JSON to standard output, or to an output file if the -o option is given.

If a project is piped into the GitLogParser, the results and the piped project are merged. The resulting project has the project name specified for the GitLogParser.

Executing the repo-scan subcommand

  • ccsh gitlogparser repo-scan --repo-path <path_to_my_git_project> -o output.cc.json.gz
  • load output.cc.json.gz in visualization
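
If you want to inspect the result outside the visualization, a short Python sketch like the one below can load it. It assumes that a file ending in .gz is gzip-compressed JSON, as the file name suggests, and makes no assumption about the structure of the JSON itself. The helper name `load_cc_json` is hypothetical.

```python
import gzip
import json


def load_cc_json(path):
    """Load a parser result file; gzip is assumed when the name ends in .gz."""
    path = str(path)
    opener = gzip.open if path.endswith(".gz") else open
    # "rt" decodes the (possibly decompressed) bytes as text for json.load.
    with opener(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `load_cc_json("output.cc.json.gz")` returns the parsed JSON as ordinary Python dictionaries and lists.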

Manual creation of required files | log-scan

See ccsh gitlogparser log-scan -h for help. Standard usage:

ccsh gitlogparser log-scan --git-log <path> --repo-files <path>

With the log-scan subcommand, an existing git log and file name list are used for parsing.

The result is written as JSON to standard output, or to an output file if the -o option is given.

If a project is piped into the GitLogParser, the results and the piped project are merged. The resulting project has the project name specified for the GitLogParser.

Creating the repository log for metric generation

SCM | Log format | Command for log creation | tracks renames | ignores deleted files | supports code churn
git | GIT_LOG_NUMSTAT_RAW_REVERSED | git log --numstat --raw --topo-order --reverse -m | yes | yes | yes

You can also use the bash script anongit, which generates a git log with anonymized authors for use with CodeCharta.
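
The code churn information in this log format comes from the --numstat lines, which git writes as added count, deleted count and path separated by tabs (with "-" in both count columns for binary files). The Python sketch below shows one way such a line could be read. It is an illustration only, not the parser's implementation, and the function name `parse_numstat_line` is hypothetical.

```python
def parse_numstat_line(line):
    """Parse one `git log --numstat` line into (path, added, deleted).

    Returns None for lines that are not numstat entries, for example the
    ":"-prefixed lines produced by --raw. Binary files yield None counts.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        return None
    added, deleted, path = parts
    if added == "-" and deleted == "-":
        return (path, None, None)  # git reports "-" for binary files
    if not (added.isdigit() and deleted.isdigit()):
        return None
    return (path, int(added), int(deleted))
```

Renamed files appear in the path column in git's "old => new" notation; this sketch returns that text unchanged and leaves rename resolution to the caller.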

Creating the git files list of the repository for metric generation

git ls-files > file-name-list.txt

Please make sure to execute this command in the root folder of your repository.

Executing the log-scan subcommand

  • cd <my_git_project>
  • git log --numstat --raw --topo-order --reverse -m > git.log (or anongit > git.log)
  • git ls-files > file-name-list.txt
  • ccsh gitlogparser log-scan --git-log git.log --repo-files file-name-list.txt -o output.cc.json.gz
  • load output.cc.json.gz in visualization
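
The steps above can also be scripted. The following Python sketch runs the same commands through `subprocess`; it assumes that `git` and `ccsh` are available on the PATH and uses only the flags shown in this section. The helper names `ccsh_log_scan_cmd` and `run_log_scan` are hypothetical.

```python
import subprocess
from pathlib import Path

# Commands taken from the steps above.
GIT_LOG_CMD = ["git", "log", "--numstat", "--raw", "--topo-order", "--reverse", "-m"]
GIT_FILES_CMD = ["git", "ls-files"]


def ccsh_log_scan_cmd(git_log, repo_files, output):
    """Build the argument list for the log-scan call."""
    return [
        "ccsh", "gitlogparser", "log-scan",
        "--git-log", str(git_log),
        "--repo-files", str(repo_files),
        "-o", str(output),
    ]


def run_log_scan(repo, output="output.cc.json.gz"):
    """Create git.log and file-name-list.txt in the repository root, then parse them."""
    repo = Path(repo).resolve()  # absolute path, so file paths stay valid with cwd=repo
    git_log = repo / "git.log"
    file_list = repo / "file-name-list.txt"
    with git_log.open("w") as fh:
        subprocess.run(GIT_LOG_CMD, cwd=repo, stdout=fh, check=True)
    with file_list.open("w") as fh:
        subprocess.run(GIT_FILES_CMD, cwd=repo, stdout=fh, check=True)
    subprocess.run(ccsh_log_scan_cmd(git_log, file_list, output), cwd=repo, check=True)
```

Because every command runs with the repository root as its working directory, a relative output path such as the default ends up inside the repository.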
