Version control & collaboration
On this page, you can follow my progress in learning about tools for version control, reproducible workflows and collaboration.
Introduction
Software developers have used version control systems for a long time. If the principle can be extended to other file types, such as LaTeX, documents, images and text files in .CSV format (comma-separated values), these systems would be useful for many other use cases such as medicine, good laboratory practice or research. Similarly, software for reproducible workflows is emerging, which makes documentation easy and allows automation of data analysis and reporting. The following software is promising: Stitch (an ETL service by Stitch Inc. that loads data from GitLab and MySQL and allows for data analysis using R and Python), or the Invantive Query Tool, Invantive Control for Excel and Invantive Control for Word (these load data from GitLab using SQL and connect to Microsoft Word or Excel, or allow for data analysis using SQL).
In addition, the KNIME platform (based on the Eclipse rich client platform (RCP)) allows creating reproducible workflows that include e.g. Python or R scripts or ImageJ2 KNIME nodes. These interact with data (e.g. databases, text files in .CSV format, Excel sheets or Google Sheets) or automatically generate reproducible reports (see knitr/RMarkdown, knitpy). KNIME workflows can be put under revision/version control with GitHub/GitLab or with the proprietary KNIME Server. You can install the KNIME Development SDK from Bio7 / Eclipse with the Eclipse plug-in Egit. Just follow the instructions and make sure you wait long enough until everything is loaded.
GitLab.com provides unlimited private or public repositories for free. Also, repositories can be part of a group and gitlab.com includes a great web IDE. In addition, issues can be marked as confidential even in the free version. Microsoft’s github.com provides fewer integrated DevOps services but is more mature, especially when it comes to search engine optimization.
Using GitLab with Bio7 / Eclipse and Egit
GitLab.com is identical to GitHub.com in terms of interaction with Egit (except that you create groups, subgroups and projects rather than just simple repositories).
First time setup
- On gitlab.com, create an account and create a new online repository.
- Confirm that the Eclipse plug-in Egit together with the Gitflow components is installed, or install them.
- Follow the user guide. The following steps provide additional information on how to proceed. Make sure that the environment variable HOME is set to Users/<UserProfile>.
- In Bio7 / Eclipse, left-click Preferences > Team > Git > Configuration > User Settings > Add Entry. In the field “key”, type “user.email” (without “”) and in the field “value” enter your GitLab login email address. Add another entry and in the field “key”, type “user.name” (without “”). In the field “value”, enter your GitLab username. Left-click Apply.
- Left-click Preferences > SSH2 > key management > Generate RSA key. Save the private key, note the password you enter, then copy the public key and save it.
- Log in to your GitLab account. Navigate to the “SSH Keys” tab in your “Profile Settings”. Paste your key in the “Key” section and give it a relevant “Title”. Use an identifiable title like “Work Laptop - Windows 7” or “Home MacBook Pro 15”.
- To enable connection to private repositories via HTTPS, run “Credential Manager” in Microsoft Windows and left-click “Windows credentials”. Verify that credentials for the following URLs have been added: git:https://<username>@gitlab.com and git:https://gitlab.com (e.g. git:https://DerAndere@gitlab.com and git:https://gitlab.com). Missing credentials can be added by left-clicking “Add a generic credential”.
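For readers who prefer the command line, the two user settings entered in the Eclipse preferences above correspond to ordinary entries in the global git configuration. The following is a minimal sketch, not part of the Egit guide: the name and email address are placeholders, and the throwaway HOME directory is only there so that the demonstration does not touch a real configuration file.

```shell
# Sketch only: a throwaway HOME keeps this demo away from your real ~/.gitconfig.
# Drop the next two lines to change your actual user settings.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Placeholder identity; replace with your own name and GitLab login email.
git config --global user.name "Jane Doe"
git config --global user.email "jane.doe@example.com"

# Read the values back to confirm that they were stored.
git config --global --get user.name
git config --global --get user.email
```

Egit reads the same user-level configuration file, which is why the HOME environment variable matters in the steps above.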
Repository setup
- In your GitLab account, create a new group (e.g. GitLabGroup1ByAuthor) and, if you like, a subgroup (e.g. Group1Subgroup1ByAuthor). Then create a new blank online project (call it e.g. Project1) and initialize it by selecting “Initialize repository with a README” (see guide).
- To create an additional remote branch for testing, open the GitLab project created in the previous step in your GitLab account and left-click “Create new…” > New branch. Type the name of the branch, e.g. “remote_mastertest”. Set “Create from” to “master” (= keep the default setting) and left-click “Create branch”. Then, similarly, create a new branch “remote_dev2” from master. Finally, create a new branch “remote_dev2_test” from remote_dev2.
- In Bio7 / Eclipse, open the Git perspective and left-click “Clone repository”. Alternatively, in any perspective, left-click File > Import > Git > Project from Git > Next > Clone URI > Next. In the wizard, paste the URI of the online GitLab repository. Under target directory, deselect the default and specify a folder under Users/<UserProfile>. Normally this is Users/<UserProfile>/git/. As protocol, select SSH and let the wizard set all settings automatically. Left-click “Finish”, enter the password that protects the SSH key (see the SSH key step under “First time setup”) and left-click OK.
- Open the Git perspective. Right-click on the local clone created in the previous step and in the context menu left-click “Import project”. Select that repository and left-click Next. If the repository contains no project folder, select “General Project” > Finish.
- Open the Resource perspective of Bio7 / Eclipse and in the Project Explorer view right-click on the project that was imported in the previous step. In the context menu, left-click Team > Switch To > New Branch > Select… > remote tracking > origin/master > OK. Type in the branch name, e.g. “local_test”. Repeat for a branch called e.g. “local_feature1”.
- In the Project Explorer view, right-click on the project and in the context menu left-click Team > Switch To > local_feature1.
- In the Project Explorer view, right-click on the root directory and in the context menu left-click New > Folder and call it e.g. src. For Arduino sketches, it is recommended to add another folder with the name of the program inside the folder src/. Then right-click on the folder and in the context menu left-click New > Project > Arduino > Arduino Sketch. In the wizard, under target directory, deselect the default and left-click “Browse”. Navigate to the directory of the local clone of the GitLab repository (e.g. Users/<UserProfile>/git/Project1) and select the project with the name of that repository. Name the project the same as the previously created directory. For other projects, the nature of a directory can be changed later.
- After the project has been created successfully, continue with development in your local environment.
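The cloning and branching steps above can also be sketched on the command line. This is only an illustration under stated assumptions: a local bare repository stands in for the GitLab project (in practice you would pass the SSH or HTTPS URL of your project to git clone), and all paths under the temporary directory are hypothetical.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the GitLab project: a bare repository with one commit on master.
git init -q --bare "$work/Project1.git"
git --git-dir="$work/Project1.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# Project1" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q "$work/Project1.git" master

# Clone the repository (in practice, use the URL of your GitLab project).
mkdir -p "$work/git"
git clone -q "$work/Project1.git" "$work/git/Project1"
cd "$work/git/Project1"

# Create local branches that start at, and track, origin/master.
git branch --track local_test origin/master
git branch --track local_feature1 origin/master

# Equivalent of Team > Switch To > local_feature1.
git checkout -q local_feature1
git branch -vv
```

The last command lists each local branch together with the remote branch it tracks.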
Local development:
- Right-click on the project in the Resource perspective. In the context menu left-click Team > Switch To > local_test. Right-click on the project again and in the context menu left-click Team > Pull.
- Repeat the previous step for the local master branch and finally for local_feature1.
- With the local_test branch checked out, add folders and files to the project in the Project Explorer view and optionally modify the content of those files in the Editor view. Perform formatting, code analysis, compilation and tests. Left-click File > Save All.
Development with Egit:
- With the local_test branch checked out, right-click on the project in the Resource perspective. In the context menu left-click Team > Merge… > local > local_test > OK > OK. Then right-click on the project and in the context menu left-click Team > Switch To > local_test. Perform code analysis, compilation and tests. If everything works as expected, proceed.
- Switch to the branch local_feature1, right-click the project and in the context menu left-click Team > Commit…. In the Git staging view of Gitflow that appears, select all files that were changed with a common goal and left-click “Add selected file to index” to stage these files. Then enter a commit message specifying the goal of the changes. Refer to the Egit guide, section “Working with Gitflow”, for details. Finally, left-click “Commit”.
- Right-click on the project and in the context menu left-click Team > Push branch local_feature1 …. In the branch field, delete the default entry master and type m for remote master, or r for remote branches starting with r. Double-click on one of the remote test branches. Then left-click Preview > Push. Test the remote test branch. If the remote test branch behaves as expected, proceed.
- Repeat the previous step but select master (= remote master) or remote_dev2 as target branch.
- Right-click on the project in the Resource perspective. In the context menu left-click Team > Switch To > master.
- Go back to the first step of “Local development”, or close Bio7 / Eclipse.
The strategy used for short-lived feature branches, before submitting a pull request, is as follows:
- Rebase interactively on a previous commit and edit the commits on the feature branch.
- Check out master, fetch changes from upstream and rebase master on upstream master (git pull --rebase).
- Create an m_feature1_PR branch based on master and test it.
- Update master and rebase the branch on master.
- Create a pull request.
- Before merging, rebase the PR branch on the updated master, then merge the PR branch into upstream master.
For long-lived feature branches, rebasing and merging can become tedious. Repeatedly rebase onto the updated master as long as the feature branch is local. Once a branch has been pushed for collaborative work, for opening a pull request or for use, changes from upstream can be incorporated by merging the updated master into the feature branch. This creates an unclean history but is fail-safe. If a long-lived feature branch has not been updated for a long time, rebasing or merging can become difficult and/or result in an untidy history. You can clean the history by creating a new branch based on upstream/master and then extracting all changes between the original feature branch and master using either git diff sha1 sha3 > diff, or git log, or git reflog (short for git log -g). Apply those changes using git apply diff.
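The patch-based clean-up can be sketched as follows. This is a self-contained illustration in a temporary repository, not the exact commands used for any real project: the local branch master stands in for upstream/master, and the branch names feature1 and feature1_clean as well as the patch file name are hypothetical.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit on master"

# A feature branch with two commits.
git checkout -q -b feature1
echo "change 1" >> file.txt
git commit -q -am "Feature: change 1"
echo "change 2" > new.txt
git add new.txt
git commit -q -m "Feature: change 2"

# Write all changes between master and the feature branch to a patch file
# that is stored outside the working tree.
git diff master feature1 > "$work/feature1.diff"

# Start a clean branch from master and apply the patch as a single commit.
git checkout -q -b feature1_clean master
git apply "$work/feature1.diff"
git add -A
git commit -q -m "Feature 1 (applied from patch)"
git log --oneline
```

The clean branch ends up with the same content as the original feature branch but with a single commit on top of master.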
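Alternatively, use git cherry-pick sha1^..sha3 to pick the whole range from sha1 to sha3 (git cherry-pick sha1 sha3 would pick only those two commits). If you have no merge commits in the feature branch and do not want to identify the first and last commit ID (sha1) manually, use the following syntax, where B stands for the feature branch and devel for its base branch:
git cherry-pick $(git log devel..B --pretty=format:"%h" | tail -1)^..$(git log B -n 1 --pretty=format:"%h")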
As a shortcut, git rebase -m invokes git cherry-pick repeatedly for each commit passed to the git rebase -m command. However, it collapses history on merge commits, so you have to cherry-pick manually if you have merge commits in the feature branch.
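To make the range form of cherry-pick concrete, here is a minimal sketch in a temporary repository. The branch names devel and B follow the command quoted above; B_clean and the file names are hypothetical, and the feature branch deliberately contains no merge commits.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/devel
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit on devel"

# Feature branch B with three commits and no merge commits.
git checkout -q -b B
for n in 1 2 3; do
    echo "feature part $n" > "part$n.txt"
    git add "part$n.txt"
    git commit -q -m "B: add part $n"
done

# Oldest and newest commit that are on B but not on devel.
first="$(git log devel..B --pretty=format:"%h" | tail -1)"
last="$(git log B -n 1 --pretty=format:"%h")"

# Replay the whole range onto a fresh branch that starts at devel.
git checkout -q -b B_clean devel
git cherry-pick "$first^..$last"
git log --oneline
```

The notation first^..last includes the first commit itself, because the range starts at its parent.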
It can be easier to squash all commits in the feature branch into one before merging/cherry-picking the changes into master. An interactive rebase with git rebase -i sha1 offers the greatest flexibility, as it allows you to reorder commits, mark them as fixup or squash them, but the rebase can be labour-intensive. In addition, reordering commits may not be possible if later commits depend on earlier changes. If the last commits were all authored by you and not pulled by others, you can use git reset --soft sha1 to reset the working branch to the sha1 commit and keep all changes after that commit in the staging area. Then use git commit again.
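A minimal sketch of the soft-reset approach, assuming a private feature branch that nobody else has pulled. The branch name feature1 and the commit messages are hypothetical, and the merge base with master plays the role of the sha1 commit.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit on master"

# Three small work-in-progress commits on a private feature branch.
git checkout -q -b feature1
for n in 1 2 3; do
    echo "step $n" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP $n"
done

# Move the branch back to the commit it started from; the combined
# changes of the three commits stay in the staging area.
base="$(git merge-base master feature1)"
git reset --soft "$base"

# Record everything as one commit.
git commit -q -m "Add feature 1"
git log --oneline
```

Because the reset rewrites the branch history, this is only safe before the commits have been shared.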
Daira Hopwood posted a more advanced solution on stackoverflow.com.
When a new software version is ready for public testing, add a tag to a specified commit and push.
At the web-host’s web interface for the online-repository, select the previously created tag and add a release note to create a release.
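On the command line, tagging a specific commit and pushing the tag might look like the sketch below. A local bare repository stands in for the hosted project, and the tag name v0.1.0 and its message are placeholders rather than a naming scheme prescribed by this page.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the hosted repository.
git init -q --bare "$work/origin.git"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"
echo "release candidate" > app.txt
git add app.txt
git commit -q -m "Prepare first test release"
git push -q origin master

# Create an annotated tag on a specific commit and push only that tag.
commit="$(git rev-parse HEAD)"
git tag -a v0.1.0 -m "First public test version" "$commit"
git push -q origin v0.1.0

# List the tags that now exist on the remote.
git ls-remote --tags origin
```

Once the tag exists on the remote, it can be selected in the web interface to attach the release note.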
Using Git with Microsoft Visual Studio Code and GitLens
Instead of Egit, one can use Microsoft’s open source IDE “Visual Studio Code” (VSCode) together with “Git for Windows”.
Installation
Detailed instructions can be found here. The installer for “Git for Windows” offers the option to use VSCode as the default editor. The global editor used with git can also be set via environment variables in “system path”. This setting can be overwritten via the git command line: In PowerShell, execute
git config --global core.editor "code --wait"
To enable connection to private repositories via HTTPS, run “Credential Manager” in Microsoft Windows and left-click “Windows credentials”. Verify that credentials for the following URLs have been added: git:https://<username>@gitlab.com and git:https://gitlab.com (e.g. git:https://DerAndere@gitlab.com and git:https://gitlab.com). Missing credentials can be added by left-clicking “Add a generic credential”.
After Git and VSCode are installed, it is recommended to install some extensions: “GitLens” for better git integration and “PlatformIO IDE” for embedded system development. After everything is updated, restart VSCode.
Setting up git repositories
If you want to contribute to an existing project, fork the original repository (called “upstream”) at github.com or gitlab.com, wherever the original project is hosted. The remote repository of your fork is referred to as “origin”. If you want to start a new project that does not contain any code from an existing project, create a new online repository at gitlab.com. In VSCode, go to “PlatformIO” > “Quick Access” > “Clone project” and specify the URL to the repository, then press enter.
Creating a new branch
Alternative A)
- Select a base branch by doing one of the following:
  - “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the branch name next to the repository name.
  - In the status bar, left-click on the current branch name and select the desired branch from the list that appears.
- “Create new branch from…”. Follow the steps on screen and check out the new branch.
- Left-click “Publish” next to the branch name.
Alternative B)
- “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > Branch > Create new branch from…
Pull changes from remote repositories
Pulling changes from origin and rebasing (git pull --rebase) is needed if you collaborate with others on the same branch, before editing and before pushing commits to the branch in the remote repository.
Pulling changes from all remotes (e.g. origin and upstream) (git remote add <remotename> <repo-url> && git fetch --all && git rebase <remotename>/<branch>) is needed if you committed changes and want to update the branch that is currently checked out (rebase onto current upstream master), e.g. before a pull request can be merged.
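The second case can be sketched end to end with local stand-in repositories. Everything here is an assumption made for illustration: the bare repositories upstream.git and fork.git replace the hosted original project and your fork, and the branch name feature1 is hypothetical.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the original project ("upstream") with one commit.
git init -q --bare "$work/upstream.git"
git --git-dir="$work/upstream.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/maintainer"
cd "$work/maintainer"
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
git remote add origin "$work/upstream.git"
echo "v1" > core.txt
git add core.txt
git commit -q -m "Upstream: initial commit"
git push -q origin master

# Stand-in for your fork ("origin") and your local clone of it.
git clone -q --bare "$work/upstream.git" "$work/fork.git"
git clone -q "$work/fork.git" "$work/me"
cd "$work/me"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b feature1
echo "my change" > mine.txt
git add mine.txt
git commit -q -m "Feature: add mine.txt"

# Meanwhile, upstream moves on.
cd "$work/maintainer"
echo "v2" > core.txt
git commit -q -am "Upstream: update core.txt"
git push -q origin master

# Register upstream once, fetch from all remotes, then rebase the
# checked-out branch onto the current upstream master.
cd "$work/me"
git remote add upstream "$work/upstream.git"
git fetch -q --all
git rebase -q upstream/master
git log --oneline
```

Note the argument order of git remote add: the remote name comes first, followed by the URL.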
If you want to pull changes from origin only, you can skip the following two steps. If you want to pull changes from other remotes (e.g. upstream) you have to make sure that it is configured as a remote of your repository. The following two steps have to be done only once per repository:
- left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Remote” > “Add Remote…”
- Specify the URL of the remote repository, then a name (e.g. “upstream”)
Once the repository that contains the branch you want to rebase onto has been added as a remote:
- Switch to the branch for which you want to pull changes (usually origin/master first, then the pull request branch) by doing one of the following:
  - “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the branch name next to the repository name.
  - In the status bar, left-click on the branch name
  - “Source control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > Checkout to…
- Select the desired branch
- “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > Pull, Push
  - To pull changes from origin and rebase (git pull --rebase), select “Pull (Rebase)”
  - To pull changes from all remotes (e.g. origin and upstream):
    - Select “Fetch from all remotes”
    - “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Branch” > “Merge Branch…”
    - Select the base branch to merge into the currently checked-out branch, e.g. “upstream/master”. If conflicts arise, see section “Rebase current branch onto a branch from a different repository”
Commit changes
- Switch to the branch where you want to add changes
- Edit files
- Save files
- Close files
- “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, expand the “Changes” menu that is below the desired repository.
- Stage all changes by doing one of the following:
  - left-click on the “+” button next to the changed file name to stage that file (add changes to the git index)
  - left-click on the “+” button next to the menu heading “Changes” to stage all changes.
  - left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Changes” > Stage all changes
- Enter a commit message. Possibly add a “Co-authored-by: Name <email address>” tag for each author
- Left-click “Commit”
Interactive Rebase of checked-out branch
This action (git rebase -i <sha>) is needed if you committed changes and want to edit or squash one or more consecutive commits. Note: Force-pushing rebased branches overwrites the rebased commits. This may cause problems for collaborators on the same branch. Collaborators may have to reset their local repository as described in section “Reset repository”. With VSCode and GitLens:
- “Source control” > “Commits”. Right-click on the desired base commit. From the context menu, select “Rebase Current branch onto commit…” > “Rebase interactively”
- Select “pick”, “reword”, “edit”, “fixup” or “squash” for each commit
- At the bottom of the git-rebase-todo GitLens Interactive Rebase dialogue, left-click “START REBASE”
- “Source Control” > “Source control” > In the section “Source control” of the SOURCE CONTROL panel, below the active repository name, expand the subsection “Merge changes”. Here, the files which contain conflicts are listed. In this list, left-click on the next file.
- Edit the file to resolve the conflict. You can use the buttons “Accept Incoming Change” or “Accept Current Change”.
- Save the edited file
- Close the edited file
- After all conflicts are resolved, stage all changes by doing one of the following:
  - left-click on the “+” button next to the changed file name to stage that file (add changes to the git index)
  - left-click on the “+” button next to the menu heading “Changes” to stage all changes.
  - left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Changes” > Stage all changes
- Continue the rebase by doing one of the following:
  - At the bottom of the git-rebase-todo GitLens Interactive Rebase dialogue, left-click “Continue Rebase”
  - In the terminal within VSCode that runs PowerShell, change directory to the current project using the command cd. Then execute the following command: git rebase --continue
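The conflict handling described above maps onto a short command-line sequence. The sketch below provokes a conflict in a temporary repository with hypothetical file and branch names, resolves it by writing the intended content, and then continues the rebase. Setting GIT_EDITOR=true is an assumption made so that the example runs without opening an editor for the commit message.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "original" > file.txt
git add file.txt
git commit -q -m "Base commit"

# The feature branch and master change the same line in different ways.
git checkout -q -b feature1
echo "feature edit" > file.txt
git commit -q -am "Feature edit"
git checkout -q master
echo "master edit" > file.txt
git commit -q -am "Master edit"

# The rebase stops at the conflicting commit and exits with a non-zero status.
git checkout -q feature1
git rebase master || echo "Rebase stopped because of a conflict"

# Resolve the conflict by writing the intended content, then stage the file.
printf 'master edit\nfeature edit\n' > file.txt
git add file.txt

# Continue; GIT_EDITOR=true accepts the existing commit message as it is.
GIT_EDITOR=true git rebase --continue
git log --oneline
```

If further commits conflict, the resolve, stage and continue cycle repeats until the rebase finishes.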
Add selected changes to your branch using git cherry-pick
The action git fetch <repo-url> <branch> && git cherry-pick <sha> is needed if you want to add individual commits from a different branch. If you want to cherry-pick commits from a different remote repository, you have to make sure that it is configured as a remote of your repository, because VSCode has no GUI element to execute git fetch <repo-url> <branch> directly:
- left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Remote” > “Add Remote…”
- Specify the URL of the remote repository
Once the repository you want to pick a commit from has been added as a remote:
- Switch to the branch to which you want to add the commits by doing one of the following:
  - “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the branch name next to the repository name.
  - In the status bar, left-click on the branch name
  - “Source control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Checkout to…”
- Select the desired branch
- “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Pull, Push” > “Fetch from all remotes”.
- “Source Control” > “Branches”. In the section “Branches” of the SOURCE CONTROL panel, expand the branch that contains the commit you want to pick by left-clicking on the branch name. Right-click on the commit that you want to pick and from the context menu, select “Cherry Pick Commit…” (or “Rebase Current Branch onto commit”)
- “Source Control” > “Source control” > In the section “Source control” of the SOURCE CONTROL panel, below the active repository name, expand the subsection “Merge changes”. Here, the files which contain conflicts are listed. In this list, left-click on the next file.
- Edit the file to resolve the conflict. You can use the buttons “Accept Incoming Change” or “Accept Current Change”.
- Save the edited file
- Close the edited file
- After all conflicts are resolved, stage all changes by doing one of the following:
  - left-click on the “+” button next to the changed file name to stage that file (add changes to the git index)
  - left-click on the “+” button next to the menu heading “Changes” to stage all changes.
  - left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Changes” > “Stage all changes”
- Continue the cherry-pick: in the terminal within VSCode that runs PowerShell, change directory to the current project using the command cd. Then execute the following command: git cherry-pick --continue
Rebase current branch onto a branch from a different repository
This rarely needed action (git remote add <remotename> <repo-url> && git fetch <remotename> <basebranch> && git rebase <remotename>/<basebranch>) can be used if you want to base the current branch (the branch that is currently checked out) off of a basebranch from a repository that is different from upstream and that is available at the repo-url. The following two steps have to be done only once per repository:
- left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Remote” > “Add Remote…”
- Specify the URL of the remote repository
Once the repository that contains the base branch has been added as a remote:
- Switch to the branch you want to rebase by doing one of the following:
  - “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the branch name next to the repository name.
  - In the status bar, left-click on the branch name
  - “Source control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Checkout to…”
- Select the desired branch
- “Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Pull, Push” > “Fetch from all remotes”
- left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Branch” > “Rebase” > “Rebase Branch”
- “Source Control” > “Source control” > In the section “Source control” of the SOURCE CONTROL panel, below the active repository name, expand the subsection “Merge changes”. Here, the files which contain conflicts are listed. In this list, left-click on the next file.
- Edit the file to resolve the conflict. You can use the buttons “Accept Incoming Change” or “Accept Current Change”.
- Save the edited file
- Close the edited file
- After all conflicts are resolved, stage all changes by doing one of the following:
  - left-click on the “+” button next to the changed file name to stage that file (add changes to the git index)
  - left-click on the “+” button next to the menu heading “Changes” to stage all changes.
  - left-click on “Source Control” > “Source control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Changes” > “Stage all changes”
- Continue the rebase: in the terminal within VSCode that runs PowerShell, change directory to the current project using the command cd. Then execute the following command: git rebase --continue
Push Changes
“Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Pull, Push” > “Push to…”
If you have pushed commits and did a rebase afterwards, you have to force push after the rebase. Note: this overwrites the rebased commits and may cause problems for collaborators. Collaborators may have to reset their repository as described in the section “Reset repository”:
“Source Control” > “Source Control”. In the section “Source control” of the SOURCE CONTROL panel, left-click on the “…” menu next to the repository name > “Pull, Push” > “Push to… (Force)”
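On the command line, the same situation looks roughly like the sketch below. It uses git push --force-with-lease, a more cautious variant of a forced push that refuses to overwrite the remote branch if it has changed since your last fetch. Whether the VSCode menu entry uses this variant is not assumed here. The bare repository and the branch name feature1 are stand-ins.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the remote repository.
git init -q --bare "$work/origin.git"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/feature1
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin feature1

# Rewrite the commit that was already pushed (a rebase has the same effect).
echo "final" > notes.txt
git commit -q -a --amend -m "Add notes (revised)"

# A plain push is now rejected because the histories have diverged.
if git push -q origin feature1 2>/dev/null; then
    echo "Unexpected: plain push succeeded"
fi

# Force the update, but only if the remote branch is still where we last saw it.
git push -q --force-with-lease origin feature1
git ls-remote origin refs/heads/feature1
```

Collaborators who already pulled the old commits still have to reset their local branch afterwards.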
Copyright 2018 - 2022 DerAndere