22 Git

22.1 Slides

Workshop: Git and Projects

Date: February 6 2020

Slides and Resources

22.3 Use SSH

Recently, GitHub deprecated password authentication for Git operations, so my recommendation is to use SSH throughout. See Happy Git With R’s SSH keys introduction. Then, when you grab links to clone, e.g. the WEEL’s ewc project, use the SSH format git@gitlab.com:WEEL_grp/ewc.git and not the HTTPS format https://gitlab.com/WEEL_grp/ewc.git.
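If a project was already cloned over HTTPS, it does not need to be cloned again. Here is a minimal sketch of switching an existing remote to SSH. It assumes the remote is named origin, and it runs in a throwaway repository so nothing real is touched. In practice only the set-url line is needed, run inside your own project.

```shell
# Create a throwaway repository so the example does not touch a real project.
repo=$(mktemp -d)
git -C "$repo" init --quiet

# Pretend the project was originally cloned over HTTPS.
git -C "$repo" remote add origin https://gitlab.com/WEEL_grp/ewc.git

# Point the existing remote at the SSH address instead.
git -C "$repo" remote set-url origin git@gitlab.com:WEEL_grp/ewc.git

# Confirm which URL the remote now uses.
ssh_url=$(git -C "$repo" remote get-url origin)
echo "$ssh_url"
```

Pushing and pulling over this address only works once an SSH key has been added to your GitLab or GitHub account.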

22.4 .gitignore

22.4.1 Include one file in an ignored folder

Use a wildcard * to ignore everything in a folder, then re-include a single file with !. The wildcard must be placed on the folder that directly contains the file, not on a parent directory, because Git cannot re-include a file when its parent folder is itself ignored. In the example below, we ignore everything in the data/derived-data folder, not the whole data/ folder.

E.g. in the file .gitignore:

.Rproj.user
.Rhistory
.RData
.Ruserdata

data/raw-data/
data/derived-data/*
!/data/derived-data/output.Rds
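One way to check that the rules behave as intended is to stage everything in a scratch copy of the layout and list what Git picked up. This is a rough sketch: the file names raw.csv and temp.Rds are made up for illustration, and only the data rules from the example are included.

```shell
# Throwaway project with the same folder layout as the example above.
repo=$(mktemp -d)
git -C "$repo" init --quiet
mkdir -p "$repo/data/raw-data" "$repo/data/derived-data"
touch "$repo/data/raw-data/raw.csv"          # hypothetical raw data file
touch "$repo/data/derived-data/temp.Rds"     # hypothetical intermediate file
touch "$repo/data/derived-data/output.Rds"   # the one file we want to keep

# Only the data rules from the example .gitignore.
cat > "$repo/.gitignore" <<'EOF'
data/raw-data/
data/derived-data/*
!/data/derived-data/output.Rds
EOF

# Stage everything that is not ignored, then list what Git would track.
git -C "$repo" add .
tracked=$(git -C "$repo" ls-files)
echo "$tracked"
```

Only .gitignore and data/derived-data/output.Rds should be listed. If the second rule were data/ instead, the ! line would have no effect.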

22.5 Tags

We can add Git tags to our commits to keep track of specific versions or stages of a repository.

For example, spatsoc and other packages use git tags to mark specific versions of the package. These versions link to GitHub/GitLab releases, making it easy to install specific versions.

For papers, you can use tags to mark the commits that correspond to different stages of revision.

To add a tag in GitAhead, right click on the commit, select add tag, then fill in the name (no spaces) and check “push to origin”. You will then be able to find the tags on GitLab, for example: wildcam tags.
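The same can be done from the command line. Here is a minimal sketch in a throwaway repository. The tag name first-submission and the commit message are made up for illustration, and a lightweight tag is used for simplicity (whether GitAhead creates lightweight or annotated tags is not assumed here).

```shell
# Throwaway repository with a single commit to tag.
repo=$(mktemp -d)
git -C "$repo" init --quiet
git -C "$repo" -c user.name="Example User" -c user.email="user@example.com" \
  commit --quiet --allow-empty -m "Submit manuscript"

# Tag the current commit; as in GitAhead, the name has no spaces.
git -C "$repo" tag first-submission

# With a remote set up, this would publish the tag (left commented out here):
# git -C "$repo" push origin first-submission

# List the tags in the repository.
tags=$(git -C "$repo" tag --list)
echo "$tags"
```

To tag an older commit rather than the current one, add its hash after the tag name.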

22.6 Blame and history

Git Blame and Git History can be used to get information about when files were changed and by whom.

For example: head over to GitLab.com and browse to one of your files. Open it up and select “history”. You will see all the times the file was changed, and selecting any of the commits will show you what it looked like at that point.

Alternatively, you can select “blame” to show who changed each line in a file and at which commit.
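The same information is available locally through git log and git blame. Here is a rough sketch in a throwaway repository. The file analysis.R, the authors Alex and Sam, and the commit messages are all invented for the example.

```shell
# Throwaway repository with one file changed by two (made-up) people.
repo=$(mktemp -d)
git -C "$repo" init --quiet

printf 'x <- 1\n' > "$repo/analysis.R"
git -C "$repo" add analysis.R
git -C "$repo" -c user.name="Alex" -c user.email="alex@example.com" \
  commit --quiet -m "Add first step"

printf 'y <- 2\n' >> "$repo/analysis.R"
git -C "$repo" add analysis.R
git -C "$repo" -c user.name="Sam" -c user.email="sam@example.com" \
  commit --quiet -m "Add second step"

# History: every commit that changed the file, newest first.
history=$(git -C "$repo" log --format='%s' -- analysis.R)
echo "$history"

# Blame: who last changed each line (author names only).
authors=$(git -C "$repo" blame --line-porcelain analysis.R | sed -n 's/^author //p')
echo "$authors"
```

Running git blame without --line-porcelain gives the more familiar view, with the commit hash, author and date next to each line.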

22.7 Issues

Issues are a tool available on GitLab and GitHub for collaborating on projects, getting help and tracking decisions. They are used extensively in the FOSS world and are often the way R package developers opt to receive bug reports and feature requests from their users.

Issues can be used for both personal and group projects. Some uses include:

  • collaborating
  • developing methods
  • troubleshooting code
  • tracking decisions or assumptions
  • defining thresholds, variables
  • reporting bugs
  • requesting new features

Issues have a number of advantages over private email, in-person conversations or TODO, NOTE and other comments embedded in code because they:

  • are easily searched
  • can be grouped or categorized using labels
  • can be assigned to individuals, groups or teams
  • keep everyone in the loop
  • are a resource for future users (including yourself!)
  • allow for asynchronous communication
  • keep track of the history of decisions, progress, etc.

22.7.1 Example issues

22.7.1.1 Bug or troubleshooting

I’m trying to do this and it isn’t working

# Some concise title
(One/two sentence description summarizing the issue.)

## Steps to reproduce
(How one can reproduce the issue)
1.
2.
3.

## Observed behaviour

## Expected behaviour

## Sample of output tables, screenshots

## Possible fixes

## Session info
(if R related, paste the results of sessionInfo())

22.7.1.2 Method development

I’d like to try and do this

# Some concise title
(One/two sentence description summarizing the objective.)

## Description
(Include goals, use cases, benefits)

## Progress
(Where did you get?)

## Next steps
(How do you think we could solve it? Who can you assign
  this issue to? Who can you mention to get their input?)

Once the issue is solved, post the code that actually solved it. This is one of the main ways that issues can be useful to future users.

22.8 Tracking different file types

Git can store any file, but it cannot show meaningful line-by-line changes for all of them, e.g. .Rds and other binary data files, and some rendered R Markdown output types.

For R Markdown, simply use:

output: github_document

or keep the .md file along with your main output type, either by rendering multiple output formats or by setting keep_md for the format you use:

output:
  pdf_document:
    keep_md: true

or

output:
  html_document:
    keep_md: true

Generate figures as separate PNGs and include them in the document using knitr::include_graphics(). This way, changes to each PNG are tracked by Git and the figures can be easily viewed separately from the R Markdown output.

22.9 GitLab CI/CD and GitHub Pages

22.9.1 Testing data

Using tests is a great way to clarify your expectations about data outputs, which can then be checked whenever the code is updated. Check out the testthat package and the checkr package.