Hide irrelevant data in your PRs
TL;DR Today I learned that you can hide irrelevant changes in GitHub PRs. The syntax for .gitattributes can be tricky though.
In my current project, it’s sometimes necessary to re-generate and commit a bunch of files to the repository. This can happen when you store some IaC or test data in the repository.
Sample repository
To demonstrate the problem, here’s an example repository: https://github.com/majk-p/nice-pr-diffs that features two PRs:
The first one, https://github.com/majk-p/nice-pr-diffs/pull/1/files, is hard to read because the diff is a wall of text: the newly added test data.
In the second one, https://github.com/majk-p/nice-pr-diffs/pull/2/files, the change is the same, but I've added a .gitattributes file with the following content:
test-data/** linguist-generated=true
Thanks to that, the content of those files is collapsed by default, with an option to load the diff on demand.
Tricky .gitattributes syntax
Although the GitHub documentation states that
A .gitattributes file uses the same rules for matching as .gitignore files
adding something like test-data linguist-generated=true does not collapse the large files within that directory. Unlike in .gitignore, a pattern that matches a directory is not applied recursively to the paths inside it. Referencing the directory is therefore not enough: you need to state explicitly that the rule applies to all files in the directory recursively, as in test-data/**.
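To see the difference locally before pushing, you can ask Git which attributes it assigns to a path with git check-attr. The script below is a minimal sketch: it uses a throwaway repository and a hypothetical file name (test-data/big.json) that are not part of the sample repository. It only shows how Git resolves the attribute, which I assume is what GitHub relies on when deciding whether to collapse a file.

```shell
#!/bin/sh
# Sketch: compare how two .gitattributes patterns apply to a file in a directory.
set -eu

# Throwaway repository with a hypothetical layout (not the sample repo).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
mkdir test-data
echo '{}' > test-data/big.json

# Directory name only: the attribute is not applied to files inside it.
echo 'test-data linguist-generated=true' > .gitattributes
dir_only=$(git check-attr linguist-generated -- test-data/big.json)
echo "$dir_only"

# Recursive glob: the attribute applies to every file under test-data/.
echo 'test-data/** linguist-generated=true' > .gitattributes
recursive=$(git check-attr linguist-generated -- test-data/big.json)
echo "$recursive"
```

With the directory-only pattern Git reports the attribute as unspecified for the file, while with the /** pattern it reports true.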
Summary
Next time you want to make your diff look nice and clean, just create a .gitattributes file with the following content, replacing my-unreadable-data with the directory of your large files:
my-unreadable-data/** linguist-generated=true
If you’re using GitLab, it seems a similar thing is achievable with -diff instead of linguist-generated=true, according to this SO answer, but I haven’t tested that.