diff --git a/post/2012-05-22-git-horror-story.md b/post/2012-05-22-git-horror-story.md new file mode 100644 index 0000000..871dd14 --- /dev/null +++ b/post/2012-05-22-git-horror-story.md @@ -0,0 +1,1316 @@ +# A Git Horror Story: Repository Integrity With Signed Commits + +_(Note: This article was written at the end of 2012 and is out of date. I +will update it at some point, but until then, please keep that in +perspective.)_ + +It's 2:00 AM. The house is quiet, the kid is in bed and your significant other +has long since fallen asleep on the couch waiting for you, the light of the TV +flashing out of the corner of your eye. Your mind and body are exhausted. +Satisfied with your progress for the night, you commit the code you've been +hacking for hours: `"[master 2e4fd96] Fixed security vulnerability CVE-123"`. +You push your changes to your host so that others can view and comment on your +progress before tomorrow's critical release, suspend your PC and struggle to +wake your significant other to get him/her in bed. You turn off the lights, trip +over a toy on your way to the bedroom and sigh as you realize you're going to +have to make a bottle for the child who just heard his/her favorite toy jingle. + +Fast forward four sleep-deprived hours. You are woken to the sound of your phone +vibrating incessantly. You smack it a few times, thinking it's your alarm clock, +then fumble half-blind as you try to to dig it out from under the bed after you +knock it off the nightstand. (Oops, you just woke the kid up again.) You pick up +the phone and are greeted by a frantic colleague. "I merged in our changes. We +need to tag and get this fix out there." Ah, damnit. You wake up your +significant other, asking him/her to deal with the crying child (yeah, that went +well) and stumble off to your PC, failing your first attempt to enter your +password. You rub your eyes and pull the changes. + +Still squinting, you glance at the flood of changes presented to you. Your +child is screaming in the background, not amused by your partner's feeble +attempts to console him/her. `git log --pretty=short`...everything looks +good---just a bunch of commits from you and your colleague that were merged in. +You run the test suite---everything passes. Looks like you're ready to go. `git +tag -s 1.2.3 -m 'Various bugfixes, including critical CVE-123' && git push +--tags`. After struggling to enter the password to your private key, slowly +standing up from your chair as you type, you run off to help with the baby +(damnit, where do they keep the source code for these things). Your CI system +will handle the rest. + +Fast forward two months. + +CVE-123 has long been fixed and successfully deployed. However, you receive an +angry call from your colleague. It seems that one of your most prominent users +has had a massive security breach. After researching the problem, your colleague +found that, according to the history, _the breach exploited a back door that you +created!_ What? You would never do such a thing. To make matters worse, `1.2.3` +was signed off by you, using your GPG key---you affirmed that this tag was +good and ready to go. "3-b-c-4-2-b, asshole", scorns your colleague. "Thanks +a lot." + +No---that doesn't make sense. You quickly check the history. `git log --patch +3bc42b`. "Added missing docblocks for X, Y and Z." You form a puzzled +expression, raising your hands from the keyboard slightly before tapping the +space bar a few times with few expectations. Sure enough, in with a few minor +docblock changes, there was one very inconspicuous line change that added the +back door to the authentication system. The commit message is fairly clear and +does not raise any red flags---why would you check it? Furthermore, the +author of the commit _was indeed you!_ + +Thoughts race through your mind. How could this have happened? That commit has +your name, but you do not recall ever having made those changes. Furthermore, +you would have never made that line change; it simply does not make sense. Did +your colleague frame you by committing as you? Was your colleague's system +compromised? Was your _host_ compromised? It couldn't have been your local +repository; that commit was clearly part of the merge and did not exist in your +local repository until your pull on that morning two months ago. + +Regardless of what happened, one thing is horrifically clear: right now, you are +the one being blamed. + + + +## Who Do You Trust? {#trust} + +Theorize all you want---it's possible that you may never fully understand what +resulted in the compromise of your repository. The above story is purely +hypothetical, but entirely within the realm of possibility. How can you rest +assured that your repository is safe for not only those who would reference or +clone it, but also those who may download, for example, tarballs that are +created from it? + +Git is a [distributed revision control +system](https://en.wikipedia.org/wiki/Distributed_revision_control). In +short, this means that anyone can have a copy of your repository to work on +offline, in private. They may commit to their own repository and users may +push/pull from each other. A central repository is unnecessary for +distributed revision control systems, but [may be used to provide an +"official" hub that others can work on and clone +from](http://lwn.net/Articles/246381/). Consequently, this also means that a +repository floating around for project X may contain malicious code; just +because someone else hands you a repository for your project doesn't mean +that you should actually use it. + +The question is not "Who _can_ you trust?"; the question is "Who _do_ you +trust?", or rather---who _are_ you trusting with your repository, right now, +even if you do not realize it? For most projects, including the story above, +there are a number of individuals or organizations that you may have +inadvertently placed your trust in without fully considering the ramifications +of such a decision: + +Git Host +: Git hosting providers are probably the most easily overlooked + trustees---providers like Gitorious, GitHub, Bitbucket, SourceForge, Google + Code, etc. Each provides hosting for your repository and "secures" it by + allowing only you, or other authorized users, to push to it, often with the + use of SSH keys tied to an account. By using a host as the primary holder of + your repository---the repository from which most clone and push to---you are + entrusting them with the entirety of your project; you are stating, "Yes, I + trust that my source code is safe with you and will not be tampered with". + This is a dangerous assumption. Do you trust that your host properly secures + your account information? Furthermore, bugs exist in all but the most + trivial pieces of software, so what is to say that there is not a + vulnerability just waiting to be exploited in your host's system, completely + compromising your repository? + + It was not too long ago (March 4th, 2012) that [a public key security + vulnerability at + GitHub](https://github.com/blog/1068-public-key-security-vulnerability-and-mitigation) + was [exploited](https://gist.github.com/1978249) by a Russian man named + [Egor + Homakov](http://homakov.blogspot.com/2012/03/im-disappoint-github.html), + allowing him to successfully [commit to the master branch of the Ruby on + Rails + framework](https://github.com/rails/rails/commit/b83965785db1eec019edf1fc272b1aa393e6dc57) + repository hosted on GitHub. Oops. + +Friends and Coworkers/Colleagues +: There may be certain groups or individuals that you trust enough to (a) pull + or accept patches from or (b) allow them to push to you or a + central/"official" repository. Operating under the assumption that each + individual is truly trustworthy (and let us hope that is the case), that + does not immediately imply that their _repository_ can be trusted. What are + their security policies? Do they leave their PC unlocked and unattended? Do + they make a habit of downloading virus-laden pornography on an unsecured, + non-free operating system? Or perhaps, through no fault of their own, they + are running a piece of software that is vulnerable to a 0-day exploit. Given + that, _how can you be sure that their commits are actually their own_? + Furthermore, how can you be sure that any commits they approve (or sign off + on using `git commit -s`) were actually approved by them? + + That is, of course, assuming that they have no ill intent. For example, what + of the pissed off employee looking to get the arrogant, obnoxious co-worker + fired by committing under the coworker's name/email? What if you were the + manager or project lead? Whose word would you take? How would you even know + whom to suspect? + +Your Own Repository +: Linus Torvalds (original author of Git and the kernel Linux) [keeps a + secured repository on his personal computer, inaccessible by any + external means](http://www.youtube.com/watch?v=4XpnKHJAok8) to ensure + that he has a repository he can fully trust. Most developers simply keep + a local copy on whatever PC they happen to be hacking on and pay no mind + to security---their repository is likely hosted elsewhere as well, after + all; Git is distributed. This is, however, a very serious matter. + + You likely use your PC for more than just hacking. Most notably, you likely + use your PC to browse the Internet and download software. Software is buggy. + Buggy software has exploits and exploits tend to get, well, exploited. Not + every developer has a strong understanding of the best security practices + for their operating system (if you do, great!). And no---simply using + GNU/Linux or any other *NIX variant does not make you immune from every + potential threat. + +To dive into each of these a bit more deeply, let us consider one of the +world's largest free software projects---the kernel Linux---and how its +original creator Linus Torvalds handles issues of trust. During [a talk he +presented at Google in 2007](http://www.youtube.com/watch?v=4XpnKHJAok8), he +describes a network of trust he created between himself and a number of +others (which he refers to as his "lieutenants"). Linus himself cannot +possibly manage the mass amount of code that is sent to him, so he has +others handle portions of the kernel. Those "lieutenants" handle most of the +requests, then submit them to Linus, who handles merging into his own +branch. In doing so, he has trusted that these lieutenants know what they +are doing, are carefully looking over each patch and that the patches Linus +receives from them are actually from them. + +I am not aware of how patches are communicated from the lieutenants to Linus. +Certainly, one way to state with a fairly high level of certainty that the patch +is coming from one of his "lieutenants" is to e-mail the patches, signed with +their respective GPG/PGP keys. At that point, the web of trust is enforced by +the signature. Linus is then sure that his private repository (which he does his +best to secure, as aforementioned) contains only data that _he personally +trusts_. His repository is safe, so far as he knows, and he can use it +confidently. + +At this point, assuming Linus' web of trust is properly verified, how can he +confidently convey these trusted changes to others? He certainly knows his own +commits, but how should others know that this "Linus Torvalds" guy who has +been committing and signing off of on commits is _actually_ Linus Torvalds? As +demonstrated in the hypothetical scenario at the beginning of this article, +anyone could claim to be Linus. If an attacker were to gain access to any clone +of the repository and commit as Linus, nobody would know the difference. +Fortunately, one can get around this by signing a tag with his/her private key +using GPG (`git tag -s`). A tag points to a particular commit and that commit +[depends on the entire history leading up to that commit](#commit-history). +This means that signing the SHA1 hash of that commit, assuming no security +vulnerabilities within SHA1, will forever state that the entire history of the +given commit, as pointed to by the given tag, is trusted. + +Well, that is helpful, but that doesn't help to verify any commits made _after_ +the tag (until the next tag comes around that includes that commit as an +ancestor of the new tag). Nor does it necessarily guarantee the integrity of all +past commits---it only states that, _to the best of Linus' knowledge_, this +tree is trusted. Notice how the hypothetical you in our hypothetical story also +signed the tag with his/her private key. Unfortunately, he/she fell prey to +something that is all too common---human error. He/she trusted that his/her +"trusted" colleague could actually be fully trusted. Wouldn't it be nice if we +could remove some of that human error from the equation? + + +## Ensuring Trust {#trust-ensure} + +What if we had a way to ensure that a commit by someone named "Mike Gerwitz" +with my e-mail address is _actually_ a commit from myself, much like we +can assert that a tag signed with my private key was actually tagged by myself? +Well, who are we trying to prove this to? If you are only proving your identity +to a project author/maintainer, then you can identify yourself in any reasonable +manner. For example, if you work within the same internal network, perhaps you +can trust that pushes from the internal IP are secure. If sending via e-mail, +you can sign the patch using your GPG key. Unfortunately, _these only extend +this level of trust to the author/maintainer, not other users!_ If I were to +clone your repository and look at the history, how do I know that a commit from +"Foo Bar" is truly a commit from Foo Bar, especially if the repository +frequently accepts patches and merge requests from many users? + +Previously, only tags could be signed using GPG. Fortunately, [Git v1.7.9 +introduced the ability to GPG-sign individual +commits](http://git.kernel.org/?p=git/git.git;a=blob_plain;f=Documentation/RelNotes/1.7.9.txt;hb=HEAD)---a +feature I have been long awaiting. Consider what may have happened to the +story at the beginning of this article if you signed each of your commits +like so: + +```sh +$ git commit -S -m 'Fixed security vulnerability CVE-123' +# ^ GPG-sign commit +``` + +Notice the `-S` flag above, instructing Git to sign the commit using your +GPG key (please note the difference between `-s` and `-S`). If you followed this +practice for each of your commits---with no exceptions---then you (or anyone +else, for that matter) could say with relative certainty that the commit was +indeed authored by yourself. In the case of our story, you could then defend +yourself, stating that if the backdoor commit truly were yours, it would have +been signed. (Of course, one could argue that you simply did not sign that +commit in order to use that excuse. We'll get into addressing such an issue in a +bit.) + +In order to set up your signing key, you first need to get your key id using +`gpg --list-secret-keys`: + +```sh +$ gpg --list-secret-keys | grep ^sec +sec 4096R/8EE30EAB 2011-06-16 [expires: 2014-04-18] +# ^^^^^^^^ +``` + +You are interested in the hexadecimal value immediately following the forward +slash in the above output (your output may vary drastically; do not worry if +your key does not contain `4096R` as above). If you have multiple secret +keys, select the one you wish to use for signing your commits. This value will +be assigned to the Git configuration value `user.signingkey`: + +```sh +# remove --global to use this key only on the current repository +$ git config --global user.signingkey 8EE30EAB +# ^ replace with your key id +``` + +Given the above, let's give commit signing a shot. To do so, we will create a +test repository and work through that for the remainder of this article. + +```sh +$ mkdir tmp && cd tmp +$ git init . +$ echo foo > foo +$ git add foo +$ git commit -S -m 'Test commit of foo' + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +[master (root-commit) cf43808] Test commit of foo + 1 file changed, 1 insertion(+) + create mode 100644 foo +``` + +The only thing that has been done differently between this commit and an +unsigned commit is the addition of the `-S` flag, indicating that we want +to GPG-sign the commit. If everything has been set up properly, you should be +prompted for the password to your secret key (unless you have `gpg-agent` +running), after which the commit will continue as you would expect, resulting in +something similar to the above output (your GPG details and SHA-1 hash will +differ). + +By default (at least in Git v1.7.9), `git log` will not list or validate +signatures. In order to display the signature for our commit, we may use the +`--show-signature` option, as shown below: + +```sh +$ git log --show-signature +commit cf43808e85399467885c444d2a37e609b7d9e99d +gpg: Signature made Fri 20 Apr 2012 11:59:01 PM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +Author: Mike Gerwitz +Date: Fri Apr 20 23:59:01 2012 -0400 + + Test commit of foo +``` + +There is an important distinction to be made here---the commit author and the +signature attached to the commit _may represent two different people_. In other +words: the commit signature is similar in concept to the `-s` option, which adds +a `Signed-off` line to the commit---it verifies that you have signed off on +the commit, but does not necessarily imply that you authored it. To demonstrate +this, consider that we have received a patch from "John Doe" that we wish to +apply. The policy for our repository is that every commit must be signed by a +trusted individual; all other commits will be rejected by the project +maintainers. To demonstrate without going through the hassle of applying an +actual patch, we will simply do the following: + +```sh +$ echo patch from John Doe >> foo +$ git commit -S --author="John Doe " -am 'Added feature X' + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +[master 16ddd46] Added feature X + Author: John Doe + 1 file changed, 1 insertion(+) +$ git log --show-signature +commit 16ddd46b0c191b0e130d0d7d34c7fc7af03f2d3e +gpg: Signature made Sat 21 Apr 2012 12:14:38 AM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +Author: John Doe +Date: Sat Apr 21 00:14:38 2012 -0400 + + Added feature X +# [...] +``` + +This then raises the question---what is to be done about those who decide to +sign their commit with their own GPG key? There are a couple options here. +First, consider the issue from a maintainer's perspective---do we necessary +care about the identity of a 3rd party contributor, so long as the provided code +is acceptable? That depends. From a legal standpoint, we may, but not every user +has a GPG key. Given that, someone creating a key for the sole purpose of +signing a few commits without some means of identity verification, only to +discard the key later (or forget that it exists) does little to verify one's +identity. (Indeed, the whole concept behind PGP is to create a web of trust by +being able to verify that the person who signed using their key is actually who +they say they are, so such a scenario defeats the purpose.) Therefore, adopting +a strict signing policy for everyone who contributes a patch is likely to be +unsuccessful. Linux and Git satisfy this legal requirement with a +`"Signed-off-by"` line in the commit, signifying that the author agrees to the +[Developer's Certificate of +Origin](http://git.kernel.org/?p=git/git.git;a=blob;f=Documentation/SubmittingPatches;h=0dbf2c9843dd3eed014d788892c8719036287308;hb=HEAD); +this essentially states that the author has the legal rights to the code +contained within the commit. When accepting patches from 3rd parties who are +outside of your web of trust to begin with, this is the next best thing. + +To adopt this policy for patches, require that authors do the following and +request that they do not GPG-sign their commits: + +```sh +$ git commit -asm 'Signed off' +# ^ -s flag adds Signed-off-by line +$ git log +commit ca05f0c2e79c5cd712050df6a343a5b707e764a9 +Author: Mike Gerwitz +Date: Sat Apr 21 15:46:05 2012 -0400 + + Signed off + + Signed-off-by: Mike Gerwitz +# [...] +``` + +Then, when you receive the patch, you can apply it with the `-S` (capital, not +lowercase) to GPG-sign the commit; this will preserve the Signed-off-by line as +well. In the case of a pull request, you can sign the commit by amending it +(`git commit -S --amend`). Note, however, that the SHA-1 hash of the commit will +change when you do so. + +What if you want to preserve the signature of whomever sent the pull request? +You cannot amend the commit, as that would alter the commit and invalidate their +signature, so dual-signing it is not an option (if Git were to even support that +option). Instead, you may consider signing the merge commit, which will be +discussed in the following section. + + +## Managing Large Merges + +Up to this point, our discussion consisted of apply patches or merging single +commits. What shall we do, then, if we receive a pull request for a certain +feature or bugfix with, say, 300 commits (which I assure you is not unusual)? In +such a case, we have a few options: + +1. **Request that the user squash all the commits into + a single commit**, thereby avoiding the problem entirely by applying the + previously discussed methods. I personally dislike this option for a few + reasons: + + * We can no longer follow the history of that feature/bugfix in order to + learn how it was developed or see alternative solutions that were + attempted but later replaced. + + * It renders `git bisect` useless. If we find a bug in the software that + was introduced by a single patch consisting of 300 squashed commits, + we are left to dig through the code and debug ourselves, rather than + having Git possibly figure out the problem for us. + +2. **Adopt a security policy that requires signing only + the merge commit** (forcing a merge commit to be created with `--no-ff` + if needed). + + * This is certainly the quickest solution, allowing a reviewer to sign + the merge after having reviewed the diff in its entirety. + + * However, it leaves individual commits open to exploitation. For + example, one commit may introduce a payload that a future commit + removes, thereby hiding it from the overall diff, but introducing + terrible effect should the commit be checked out individually (e.g. by + `git bisect`). Squashing all commits ([option #1](#merge-1)), signing + each commit individually ([option #3](#merge-3)), or simply reviewing + each commit individually before performing the merge (without signing + each individual commit) would prevent this problem. + + * This also does not fully prevent the situation mentioned in the + hypothetical story at the beginning of this article---others can still + commit with you as the author, but the commit would not have been + signed. + + * Preserves the SHA-1 hashes of each individual commit. + +3. **Sign each commit to be introduced by the merge.** + + * The tedium of this chore can be greatly reduced by using + http://www.gnupg.org/documentation/manuals/gnupg/Invoking-GPG_002dAGENT.html[ + `gpg-agent`]. + + * Be sure to carefully review _each commit_ rather than the entire diff to + ensure that no malicious commits sneak into the history (see bullets + for [option #2](#merge-2)). If you instead decide to script the sign + of each commit without reviewing each individual diff, you may as well + go with [option #2](#merge-2). + + * Also useful if one needs to cherry-pick individual commits, since that would + result in all commits having been signed. + + * One may argue that this option is unnecessarily redundant, considering that + one can simply review the individual commits without signing them, then + simply sign the merge commit to signify that all commits have been + reviewed ([option #2](#merge-2)). The important point to note here is + that this option offers _proof_ that each commit was reviewed (unless + it is automated). + + * This will create a new for each (the SHA-1 hash is not preserved). + +Which of the three options you choose depends on what factors are important and +feasible for your particular project. Specifically: + +* If history is not important to you, then you can avoid a lot of trouble by + simply requiring the the commits be squashed ([option #1](#merge-1)). + +* If history _is_ important to you, but you do not have the time to review + individual commits: + + * Use [option #2](#merge-2) if you understand its risks. + + * Otherwise, use [option #3](#merge-3), but _do not_ automate the signing + process to avoid having to look at individual commits. If you wish to keep + the history, do so responsibly. + +Option #1 in the list above can easily be applied to the discussion in the +previous section. + + +### (Option #2) + +[Option #2](#merge-2) is as simple as passing the `-S` argument to `git +merge`. If the merge is a fast-forward (that is, all commits can simply be +applied atop of `HEAD` without any need for merging), then you would need to use +the `--no-ff` option to force a merge commit. + +```sh +# set up another branch to merge +$ git checkout -b bar +$ echo bar > bar +$ git add bar +$ git commit -m 'Added bar' +$ echo bar2 >> bar +$ git commit -am 'Modified bar' +$ git checkout master + +# perform the actual merge (will be a fast-forward, so --no-ff is needed) +$ git merge -S --no-ff bar +# ^ GPG-sign merge commit + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +Merge made by the 'recursive' strategy. + bar | 2 ++ + 1 file changed, 2 insertions(+) + create mode 100644 bar +``` + +Inspecting the log, we will see the following: + +```sh +$ git log --show-signature +commit ebadba134bde7ae3d39b173bf8947a69be089cf6 +gpg: Signature made Sun 22 Apr 2012 11:36:17 AM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +Merge: 652f9ae 031f6ee +Author: Mike Gerwitz +Date: Sun Apr 22 11:36:15 2012 -0400 + + Merge branch 'bar' + +commit 031f6ee20c1fe601d2e808bfb265787d56732974 +Author: Mike Gerwitz +Date: Sat Apr 21 17:35:27 2012 -0400 + + Modified bar + +commit ce77088d85dee3d687f1b87d21c7dce29ec2cff1 +Author: Mike Gerwitz +Date: Sat Apr 21 17:35:20 2012 -0400 + + Added bar +# [...] +``` + +Notice how the merge commit contains the signature, but the two commits involved +in the merge (`031f6ee` and `ce77088`) do not. Herein lies the problem---what +if commit `031f6ee` contained the backdoor mentioned in the story at the +beginning of the article? This commit is supposedly authored by you, but because +it lacks a signature, it could actually be authored by anyone. Furthermore, if +`ce77088` contained malicious code that was removed in `031f6ee`, then it would +not show up in the diff between the two branches. That, however, is an issue +that needs to be addressed by your security policy. Should you be reviewing +individual commits? If so, a review would catch any potential problems with the +commits and wouldn't require signing each commit individually. The merge itself +could be representative of "Yes, I have reviewed each commit individually and I +see no problems with these changes." + +If the commitment to reviewing each individual commit is too large, consider +[Option #1](#merge-1). + +### (Option #3) + +[Option #3](#merge-3) in the above list makes the review of each commit +explicit and obvious; with [option #2](#merge-2), one could simply lazily +glance through the commits or not glance through them at all. That said, one +could do the same with [option #3](#merge-3) by automating the signing of each +commit, so it could be argued that this option is completely unnecessary. Use +your best judgment. + +The only way to make this option remotely feasible, especially for a large +number of commits, is to perform the audit in such a way that we do not have +to re-enter our secret key passphrases for each and every commit. For this, +we can use +[`gpg-agent`](http://www.gnupg.org/documentation/manuals/gnupg/Invoking-GPG_002dAGENT.html), +which will safely store the passphrase in memory for the next time that it +is requested. Using `gpg-agent`, [we will only be prompted for the password +a single +time](http://stackoverflow.com/questions/9713781/how-to-use-gpg-agent-to-bulk-sign-git-tags/10263139). Depending +on how you start `gpg-agent`, _be sure to kill it after you are done!_ + +The process of signing each commit can be done in a variety of ways. Ultimately, +since signing the commit will result in an entirely new commit, the method you +choose is of little importance. For example, if you so desired, you could +cherry-pick individual commits and then `-S --amend` them, but that would +not be recognized as a merge and would be terribly confusing when looking +through the history for a given branch (unless the merge would have been a +fast-forward). Therefore, we will settle on a method that will still produce a +merge commit (again, unless it is a fast-forward). One such way to do this is to +interactively rebase each commit, allowing you to easily view the diff, sign it, +and continue onto the next commit. + +```sh +# create a new audit branch off of bar +$ git checkout -b bar-audit bar +$ git rebase -i master +# | ^ the branch that we will be merging into +# ^ interactive rebase (alternatively: long option --interactive) +``` + +First, we create a new branch off of `bar`---`bar-audit`---to perform the +rebase on (see `bar` branch created in demonstration of [option +#2](#merge-2)). Then, in order to step through each commit that would be +merged into `master`, we perform a rebase using `master` as the upstream +branch. This will present every commit that is in `bar-audit` (and +consequently `bar`) that is not in `master`, opening them in your preferred +editor: + +``` +e ce77088 Added bar +e 031f6ee Modified bar + +# Rebase 652f9ae..031f6ee onto 652f9ae +# +# Commands: +# p, pick = use commit +# r, reword = use commit, but edit the commit message +# e, edit = use commit, but stop for amending +# s, squash = use commit, but meld into previous commit +# f, fixup = like "squash", but discard this commit's log message +# x, exec = run command (the rest of the line) using shell +# +# If you remove a line here THAT COMMIT WILL BE LOST. +# However, if you remove everything, the rebase will be aborted. +# +``` + +To modify the commits, replace each `pick` with `e` (or `edit`), as shown above. +(In vim you can also do the following `ex` command: `:%s/^pick/e/`; +adjust regex flavor for other editors). Save and close. You will then be +presented with the first (oldest) commit: + +```sh +Stopped at ce77088... Added bar +You can amend the commit now, with + + git commit --amend + +Once you are satisfied with your changes, run + + git rebase --continue + +# first, review the diff (alternatively, use tig/gitk) +$ git diff HEAD^ +# if everything looks good, sign it +$ git commit -S --amend +# GPG-sign ^ ^ amend commit, preserving author, etc + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +[detached HEAD 5cd2d91] Added bar + 1 file changed, 1 insertion(+) + create mode 100644 bar + +# continue with next commit +$ git rebase --continue + +# repeat. +$ ... +Successfully rebased and updated refs/heads/bar-audit. +``` + +Looking through the log, we can see that the commits have been rewritten to +include the signatures (consequently, the SHA-1 hashes do not match): + +```sh +$ git log --show-signature HEAD~2.. +commit afb1e7373ae5e7dae3caab2c64cbb18db3d96fba +gpg: Signature made Sun 22 Apr 2012 01:37:26 PM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +Author: Mike Gerwitz +Date: Sat Apr 21 17:35:27 2012 -0400 + + Modified bar + +commit f227c90b116cc1d6770988a6ca359a8c92a83ce2 +gpg: Signature made Sun 22 Apr 2012 01:36:44 PM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +Author: Mike Gerwitz +Date: Sat Apr 21 17:35:20 2012 -0400 + + Added bar +``` + +We can then continue to merge into `master` as we normally would. The next +consideration is whether or not to sign the merge commit as we would with +[option #2](#merge-2). In the case of our example, the merge is a +fast-forward, so the merge commit is unnecessary (since the commits being merged +are already signed, we have no need to create a merge commit using `--no-ff` +purely for the purpose of signing it). However, consider that you may perform +the audit yourself and leave the actual merge process to someone else; perhaps +the project has a system in place where project maintainers must review the code +and sign off on it, and then other developers are responsible for merging and +managing conflicts. In that case, you may want a clear record of who merged the +changes in. + + +## Enforcing Trust + +Now that you have determined a security policy appropriate for your particular +project/repository (well, hypothetically at least), some way is needed to +enforce your signing policies. While manual enforcement is possible, it is +subject to human error, peer scrutiny ("just let it through!") and is +unnecessarily time-consuming. Fortunately, this is one of those things that you +can script, sit back and enjoy. + +Let us first focus on the simpler of automation tasks---checking to ensure +that _every_ commit is both signed and trusted (within our web of trust). Such +an implementation would also satisfy [option #3](#merge-3) in regards to +merging. Well, perhaps not every commit will be considered. Chances are, you +have an existing repository with a decent number of commits. If you were to go +back and sign all those commits, you would completely alter the history of the +entire repository, potentially creating headaches for other users. Instead, you +may consider beginning your checks _after_ a certain commit. + +### Commit History In a Nutshell {#commit-history} + +The SHA-1 hashes of each commit in Git are created using the delta _and_ header +information for each commit. This header information includes the commit's +_parent_, whose header contains its parent---so on and so forth. In addition, +Git depends on the entire history of the repository leading up to a given commit +to construct the requested revision. Consequently, this means that the history +cannot be altered without someone noticing (well, this is not entirely true; +we'll discuss that in a moment). For example, consider the following branch: + +``` +Pre-attack: + +---o---o---A---B---o---o---H + a1b2c3d^ +``` + +Above, `H` represents the current `HEAD` and commit identified by `A` is the +parent of commit `B`. For the sake of discussion, let's say that commit `A` is +identified by the SHA-1 fragment `a1b2c3d`. Let us say that an attacker decides +to replace commit `A` with another commit. In doing so, the SHA-1 hash of the +commit must change to match the new delta and contents of the header. This new +commit is identified as `X`: + +``` +Post-attack: + +---o---o---X---B---o---o---H + d4e5f6a^ ^!expects parent a1b2c3d +``` + +We now have a problem; when Git encounters commit `B` (remember, Git must build +`H` using the entire history leading up to it), it will check its SHA-1 hash and +notice that it no longer matches the hash of its parent. The attacker is unable +to change the expected hash in commit `B`, because the header is used to +generate the SHA-1 hash for the commit, meaning `B` would then have a different +SHA-1 hash (technically speaking, it would not longer be `B`---it would be an +entirely different commit; we retain the identifier here only for demonstration +purposes). That would then invalidate any children of `B`, so on and so forth. +Therefore, in order to rewrite the history for a single commit, _the entire +history after that commit must also be rewritten_ (as is done by `git rebase`). +Should that be done, the SHA-1 hash of `H` would also need to change. Otherwise, +`H`'s history would be invalid and Git would immediately throw an error upon +attempting a checkout. + +This has a very important consequence---given any commit, we can rest +assured that, if it exists in the repository, Git will _always_ reconstruct +that commit exactly as it was created (including all the history leading up +to that commit _when_ it was created), or it will not do so at all. Indeed, +as Linus mentions in a presentation at Google, [he need only remember the +SHA-1 hash of a single commit](http://www.youtube.com/watch?v=4XpnKHJAok8) +to rest assured that, given any other repository, in the event of a loss of +his own, that commit will represent exactly the same commit that it did in +his own repository. What does that mean for us? Importantly, it means that +*we do not have to rewrite history to sign each commit*, because the history +of our _next_ signed commit is guaranteed. The only downside is, of course, +that the history itself could have already been exploited in a manner +similar to our initial story, but an automated mass-signing of all past +commits for a given author wouldn't catch such a thing anyway. + +That said, it is important to understand that the integrity of your +repository guaranteed only if a [hash +collision](https://en.wikipedia.org/wiki/Hash_collision) cannot be +created---that is, if an attacker were able to create the same SHA-1 hash +with _different_ data, then the child commit(s) would still be valid and the +repository would have been successfully compromised. [Vulnerabilities have +been known in +SHA-1](http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html) +since 2005 that allow hashes to be computed [faster than brute +force](http://www.schneier.com/blog/archives/2005/02/sha1_broken.html), +although they are not cheap to exploit. Given that, while your repository +may be safe for now, there will come some point in the future where SHA-1 +will be considered as crippled as MD5 is today. At that point in time, +however, maybe Git will offer a secure migration solution to [an algorithm +like SHA-256](http://kerneltrap.org/mailarchive/git/2006/8/27/211001) or +better. Indeed, [SHA-1 hashes were never intended to make Git +cryptographically +secure](http://kerneltrap.org/mailarchive/git/2006/8/27/211020). + +Given that, the average person is likely to be fine with leaving his/her history +the way it is. We will operate under that assumption for our implementation, +offering the ability to ignore all commits prior to a certain commit. If one +wishes to validate all commits, the reference commit can simply be omitted. + +### Automating Signature Checks {#automate} + +The idea behind verifying that certain commits are trusted is fairly simple: + +> Given reference commit $r$ (optionally empty), let +> $C$ be the set of all commits such that $C$ = `r..HEAD` +> ([range spec](http://book.git-scm.com/4_git_treeishes.html)) and let +> $K$ be the set of all public keys in a given GPG keyring. We must assert +> that, for each commit $c$ in $C$, there must exist a key $k$ in +> keyring $K$ such that $k$ is +> [trusted](https://en.wikipedia.org/wiki/Web_of_trust) and can be used to +> verify the signature of $c$. This assertion is denoted by the function +> $g$ (GPG) in the following expression: $∀c∈C g(c)$. + +Fortunately, as we have already seen in previous sections with the +`--show-signature` option to `git log`, Git handles the signature verification +for us; this reduces our implementation to a simple shell script. However, the +output we've been dealing with is not the most convenient to parse. It would be +nice if we could get commit and signature information on a single line per +commit. This can be accomplished with `--pretty`, but we have an additional +problem---at the time of writing (in Git v1.7.10), the GPG `--pretty` options +are undocumented. + +A quick look at [`format_commit_one()` in +`pretty.c`](https://github.com/gitster/git/blob/f9d995d5dd39c942c06829e45f195eeaa99936e1/pretty.c#L1038) +yields a `'G'` placeholder that has three different formats: + +- *`%GG`*---GPG output (what we see in `git log --show-signature`) +- *`%G?`*---Outputs "G" for a good + signature and "B" for a bad signature; otherwise, an empty string ([see + mapping in `signature_check` + struct](https://github.com/gitster/git/blob/f9d995d5dd39c942c06829e45f195eeaa99936e1/pretty.c#L808)) +- *`%GS`*---The name of the signer + +We are interested in using the most concise and minimal representation --- +`%G?`. Because this placeholder simply matches text on the GPG output, and the +string `"gpg: Can't check signature: public key not found"` is not mapped in +`signature_check`, unknown signatures will output an empty string, not "B". +This is not explicit behavior, so I'm unsure if this will change in future +releases. Fortunately, we are only interested in "G", so this detail will not +matter for our implementation. + +With this in mind, we can come up with some useful one-line output per commit. +The below is based on the output resulting from the demonstration of +[merge option #3](#merge-3) above: + +```sh +$ git log --pretty="format:%H %aN %s %G?" +afb1e7373ae5e7dae3caab2c64cbb18db3d96fba Mike Gerwitz Modified bar G +f227c90b116cc1d6770988a6ca359a8c92a83ce2 Mike Gerwitz Added bar G +652f9aed906a646650c1e24914c94043ae99a407 John Doe Signed off G +16ddd46b0c191b0e130d0d7d34c7fc7af03f2d3e John Doe Added feature X G +cf43808e85399467885c444d2a37e609b7d9e99d Mike Gerwitz Test commit of foo G +``` + +Notice the "G" suffix for each of these lines, indicating that the signature +is valid (which makes sense, since the signature is our own). Adding an +additional commit, we can see what happens when a commit is unsigned: + +```sh +$ echo foo >> foo +$ git commit -am 'Yet another foo' +$ git log --pretty="format:%H %aN %s %G?" HEAD^.. +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +``` + +Note that, as aforementioned, the string replacement of `%G?` is empty when the +commit is unsigned. However, what about commits that are signed but untrusted +(not within our web of trust)? + +``` +$ gpg --edit-key 8EE30EAB +[...] +gpg> trust +[...] +Please decide how far you trust this user to correctly verify other users' keys +(by looking at passports, checking fingerprints from different sources, etc.) + + 1 = I don't know or won't say + 2 = I do NOT trust + 3 = I trust marginally + 4 = I trust fully + 5 = I trust ultimately + m = back to the main menu + +Your decision? 2 +[...] + +gpg> save +Key not changed so no update needed. +$ git log --pretty="format:%H %aN %s %G?" HEAD~2.. +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +afb1e7373ae5e7dae3caab2c64cbb18db3d96fba Mike Gerwitz Modified bar G +``` + +Uh oh. It seems that Git does not seem to check whether or not a signature is +trusted. Let's take a look at the full GPG output: + + +```sh +$ git log --show-signature HEAD~2..HEAD^ +commit afb1e7373ae5e7dae3caab2c64cbb18db3d96fba +gpg: Signature made Sun 22 Apr 2012 01:37:26 PM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +gpg: WARNING: This key is not certified with a trusted signature! +gpg: There is no indication that the signature belongs to the owner. +Primary key fingerprint: 2217 5B02 E626 BC98 D7C0 C2E5 F22B B815 8EE3 0EAB +Author: Mike Gerwitz +Date: Sat Apr 21 17:35:27 2012 -0400 + + Modified bar +``` + +As you can see, GPG provides a clear warning. Unfortunately, +[`parse_signature_lines()` in +`pretty.c`](https://github.com/gitster/git/blob/f9d995d5dd39c942c06829e45f195eeaa99936e1/pretty.c#L808), +which references a simple mapping in `struct signature_check`, will +blissfully ignore the warning and match only `"Good signature from"`, +yielding "G". A patch to provide a separate token for untrusted keys is +simple, but for the time being, we will explore two separate +implementations---one that will parse the simple one-line output that is +ignorant of trust and a mention of a less elegant implementation that parses +the GPG output. ^[Should the patch be accepted, this article will be +updated to use the new token.] + + +#### Signature Check Script, Disregarding Trust {#script-notrust} + +As mentioned above, due to limitations of the current `%G?` implementation, we +cannot determine from the single-line output whether or not the given signature +is actually trusted. This isn't necessarily a problem. Consider what will +likely be a common use case for this script---to be run by a continuous +integration (CI) system. In order to let the CI system know what signatures +should be trusted, you will likely provide it with a set of keys for known +committers, which eliminates the need for a web of trust (the act of placing the +public key on the server indicates that you trust the key). Therefore, if the +signature is recognized and is good, the commit can be trusted. + +One additional consideration is the need to ignore all ancestors of a given +commit, which is necessary on older repositories where older commits will not be +signed (see [Commit History In a Nutshell](#commit-history) for information on +why it is unnecessary, and probably a bad idea, to sign old commits). As such, +our script will accept a ref and will only consider its children in the check. + +This script *assumes that each commit will be signed* and will output the SHA-1 +hash of each unsigned/bad commit, in addition to some additional, useful +information, delimited by tabs. + +```sh +#!/bin/sh +# +# Licensed under the CC0 1.0 Universal license (public domain). +# +# Validate signatures on each and every commit within the given range +## + +# if a ref is provided, append range spec to include all children +chkafter="${1+$1..}" + +# note: bash users may instead use $'\t'; the echo statement below is a more +# portable option +t=$( echo '\t' ) + +# Check every commit after chkafter (or all commits if chkafter was not +# provided) for a trusted signature, listing invalid commits. %G? will output +# "G" if the signature is trusted. +git log --pretty="format:%H$t%aN$t%s$t%G?" "${chkafter:-HEAD}" \ + | grep -v "${t}G$" + +# grep will exit with a non-zero status if no matches are found, which we +# consider a success, so invert it +[ $? -gt 0 ] +``` + +That's it; Git does most of the work for us! If a ref is provided, it will be +converted into a [range spec](http://book.git-scm.com/4_git_treeishes.html) by +appending `".."` (e.g. `a1b2c` becomes `a1b2c..`), which will cause `git log` +to return all of its children (_not_ including the ref itself). If no ref is +provided, we end up using `HEAD` without a range spec, which will simply list +every commit (using an empty string will cause Git to throw an error, and we +must quote the string in case the user decides to do something like `"master@{5 +days ago}"`). Using the `--pretty` option to `git log`, we output the GPG +signature result with `%G?`, in addition to some useful information we will want +to see about any commits that do not pass the test. We can then filter out all +commits that have been signed with a known key by removing all lines that end in +"G"---the output from `%G?` indicating a good signature. + +Let's see it in action (assuming the script has been saved as `signchk`): + +```sh +$ chmod +x signchk +$ ./signchk +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +$ echo $? +1 +``` + +With no arguments, the script checks every commit in our repository, finding a +single commit that has not been signed. At this point, we can either check the +output itself or check the exit status of the script, which indicates a failure. +If this script were run by a CI system, the best option would be to abort the +build and immediately notify the maintainers of a potential security breach (or, +more likely, someone simply forgot to sign their commit). + +If we check commits after that failure, assuming that each of the children have +been signed, we will see the following: + +```sh +$ ./signchk f7292 +$ echo $? +0 +``` + +Be careful when running this script directly from the repository, especially +with CI systems---you must either place a copy of the script outside of the +repository or run the script from a trusted point in history. For example, if +your CI system were to simply pull from the repository and then run the script, +an attacker need only modify the script to circumvent this check entirely. + + +#### Signature Check Script With Web Of Trust {#script-trust} + +The web of trust would come in handy for large groups of contributors; in such a +case, your CI system could attempt to download the public key from a +preconfigured keyserver when the key is encountered (updating the key if +necessary to get trust signatures). Based on the web of trust established from +the public keys directly trusted by the CI system, you could then automatically +determine whether or not a commit can be trusted even if the key was not +explicitly placed on the server. + +To accomplish this task, we will split the script up into two distinct +portions---retrieving/updating all keys within the given range, followed by the +actual signature verification. Let's start with the key gathering portion, +which is actually a trivial task: + +```sh +$ git log --show-signature \ + | grep 'key ID' \ + | grep -o '[A-Z0-9]\+$' \ + | sort \ + | uniq \ + | xargs gpg --keyserver key.server.org --recv-keys $keys +``` + +The above string of commands simply uses `grep` to pull the key ids out of `git +log` output (using `--show-signature` to produce GPG output), and then requests +only the unique keys from the given keyserver. In the case of the repository +we've been using throughout this article, there is only a single signature---my +own. In a larger repository, all unique keys will be listed. Note that the +above example does not specify any range of commits; you are free to integrate +it into the `signchk` script to use the same range, but it isn't strictly +necessary (it may provide a slight performance benefit, depending on the number +of commits that would have been ignored). + +Armed with our updated keys, we can now verify the commits based on our web +of trust. Whether or not a specific key will be trusted is [dependent on +your personal +settings](http://www.gnupg.org/gph/en/manual.html#AEN533). The idea here is +that you can trust a set of users (e.g. Linus' "lieutenants") that in turn +will trust other users which, depending on your configuration, may +automatically be within your web of trust even if you do not personally +trust them. This same concept can be applied to your CI server by placing +its keyring in place of you own (or perhaps you will omit the CI server and +run the script yourself). + +Unfortunately, with Git's current `%G?` implementation, [we are unable to +check basic one-line output](#automate). Instead, we must parse the output +of `--show-signature` ([as shown above](#gpg-sig-untrusted)) for each +relevant commit. Combining our output with [the original script that +disregards trust](#script-notrust), we can arrive at the following, which is +the output that we must parse: + +```sh +$ git log --pretty="format:%H$t%aN$t%s$t%G?" --show-signature +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +gpg: Signature made Sun 22 Apr 2012 01:37:26 PM EDT using RSA key ID 8EE30EAB +gpg: Good signature from "Mike Gerwitz (Free Software Developer) " +gpg: WARNING: This key is not certified with a trusted signature! +gpg: There is no indication that the signature belongs to the owner. +Primary key fingerprint: 2217 5B02 E626 BC98 D7C0 C2E5 F22B B815 8EE3 0EAB +afb1e7373ae5e7dae3caab2c64cbb18db3d96fba Mike Gerwitz Modified bar G +[...] +``` + +In the above snippet, it should be noted that the first commit (`f7292`) is +_not_ signed, whereas the second (`afb1e`) is. Therefore, the GPG output +_preceeds_ the commit line itself. Let's consider our objective: + +. List all unsigned commits, or commits with unknown or invalid signatures. +. List all signed commits that are signed with known signatures, but are + otherwise untrusted. + +Our [previous script](#script-notrust) performs #1 just fine, so we need only +augment it to support #2. In essence---we wish to convert lines ending in +"G" to something else if the GPG output _preceeding_ that line indicates that +the signature is untrusted. + +There are many ways to go about doing this, but we will settle for a fairly +clear set of commands that can be used to augment the previous script. To +prevent the lines ending with "G" from being filtered from the output (should +they be untrusted), we will suffix untrusted lines with "U". Consider the +output of the following: + +```sh +$ git log --pretty="format:^%H$t%aN$t%s$t%G?" --show-signature \ +> | grep '^\^\|gpg: .*not certified' \ +> | awk ' +> /^gpg:/ { +> getline; +> printf "%s U\n", $0; +> next; +> } +> { print; } +> ' \ +> | sed 's/^\^//' +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +afb1e7373ae5e7dae3caab2c64cbb18db3d96fba Mike Gerwitz Modified bar G U +f227c90b116cc1d6770988a6ca359a8c92a83ce2 Mike Gerwitz Added bar G U +652f9aed906a646650c1e24914c94043ae99a407 John Doe Signed off G U +16ddd46b0c191b0e130d0d7d34c7fc7af03f2d3e John Doe Added feature X G U +cf43808e85399467885c444d2a37e609b7d9e99d Mike Gerwitz Test commit of foo G U +``` + +Here, we find that if we filter out those lines ending in "G" as we did +before, we would be left with the untrusted commits in addition to the commits +that are bad ("B") or unsigned (blank), as indicated by `%G?`. To accomplish +this, we first add the GPG output to the log with the `--show-signature` option +and, to make filtering easier, prefix all commit lines with a caret (^) which +we will later strip. We then filter all lines but those beginning with a caret, +or lines that contain the string "not certified", which is part of the GPG +output. This results in lines of commits with a single `"gpg:"` line before +them if they are untrusted. We can then pipe this to awk, which will remove all +`"gpg:"`-prefixed lines and append `"U"` to the next line (the commit line). +Finally, we strip off the leading caret that was added during the beginning of +this process to produce the final output. + +Please keep in mind that there is a huge difference between the conventional use +of trust with PGP/GPG ("I assert that I know this person is who they claim they +are") vs trusting someone to commit to your repository. As such, it may be in +your best interest to maintain an entirely separate web of trust for your CI +server or whatever user is being used to perform the signature checks. + + +### Automating Merge Signature Checks {#script-merge} + +The aforementioned scripts are excellent if you wish to check the validity of +each individual commit, but not everyone will wish to put forth that amount of +effort. Instead, maintainers may opt for a workflow that requires the signing +of only the merge commit ([option #2 above](#merge-2)), rather than each +commit that is introduced by the merge. Let us consider the appropach we would +have to take for such an implementation: + +> Given reference commit $r$ (optionally empty), let +> $C'$ be the set of all _first-parent_ commits such that $C'$ = `r..HEAD` +> ([range spec](http://book.git-scm.com/4_git_treeishes.html)) and let +> $K$ be the set of all public keys in a given GPG keyring. We must assert +> that, for each commit $c$ in $C$, there must exist a key $k$ in +> keyring $K$ such that $k$ is +> [trusted](https://en.wikipedia.org/wiki/Web_of_trust) and can be used to +> verify the signature of\ $c$. This assertion is denoted by the function +> $g$ (GPG) in the following expression: $∀c∈C′ g(c)$. + +The only difference between this script and the script that checks for a +signature on each individual commit is that *this script will only check for +commits on a particular branch* (e.g. `master`). This is important---if we +commit directly onto master, we want to ensure that the commit is signed (since +there will be no merge). If we merge _into_ master, a merge commit will be +created, which we may sign and ignore all commits introduced by the merge. If +the merge is a fast-forward, a merge commit can be forcefully created with the +`--no-ff` option to avoid the need to amend each commit with a signature. + +To demonstrate a script that can valdiate commits for this type of workflow, +let's first create some changes that would result in a merge: + +```sh +$ git checkout -b diverge +$ echo foo > diverged +$ git add diverged +$ git commit -m 'Added content to diverged' +[diverge cfe7389] Added content to diverged + 1 file changed, 1 insertion(+) + create mode 100644 diverged +$ echo foo2 >> diverged +$ git commit -am 'Added additional content to diverged' +[diverge 996cf32] Added additional content to diverged + 1 file changed, 1 insertion(+) +$ git checkout master +Switched to branch 'master' +$ echo foo >> foo +$ git commit -S -am 'Added data to master' + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +[master 3cbc6d2] Added data to master + 1 file changed, 1 insertion(+) +$ git merge -S diverge + +You need a passphrase to unlock the secret key for +user: "Mike Gerwitz (Free Software Developer) " +4096-bit RSA key, ID 8EE30EAB, created 2011-06-16 + +Merge made by the 'recursive' strategy. + diverged | 2 ++ + 1 file changed, 2 insertions(+) + create mode 100644 diverged +``` + +Above, committed in both `master` and a new `diverge` branch in order to ensure +that the merge would not be a fast-forward (alternatively, we could have used +the `--no-ff` option of `git merge`). This results in the following (your hashes +will vary): + +``` +$ git log --oneline --graph +* 9307dc5 Merge branch 'diverge' +|\ +| * 996cf32 Added additional content to diverged +| * cfe7389 Added content to diverged +* | 3cbc6d2 Added data to master +|/ +* f729243 Yet another foo +* afb1e73 Modified bar +* f227c90 Added bar +* 652f9ae Signed off +* 16ddd46 Added feature X +* cf43808 Test commit of foo +``` + +From the above graph, we can see that we are interested in signatures on only +two of the commits: `3cbc6d2`, which was created directly on `master`, and +`9307dc5`---the merge commit. The other two commits (`996cf32` and `cfe7389`) +need not be signed because the signing of the merge commit asserts their +validity (assuming that the author of the merge was vigilant). But how do we +ignore those commits? + +``` +$ git log --oneline --graph --first-parent +* 9307dc5 Merge branch 'diverge' +* 3cbc6d2 Added data to master +* f729243 Yet another foo +* afb1e73 Modified bar +* f227c90 Added bar +* 652f9ae Signed off +* 16ddd46 Added feature X +* cf43808 Test commit of foo +``` + +The above example simply added the `--first-parent` option to `git log`, which +will display only the first parent commit when encountering a merge commit. +Importantly, this means that we are left with _only the commits on_ `master` (or +whatever branch you decide to reference). These are the commits we wish to +validate. + +Performing the validation is therefore only a slight modification to the +original script: + +```sh +#!/bin/sh +# +# Validate signatures on only direct commits and merge commits for a particular +# branch (current branch) +## + +# if a ref is provided, append range spec to include all children +chkafter="${1+$1..}" + +# note: bash users may instead use $'\t'; the echo statement below is a more +# portable option (-e is unsupported with /bin/sh) +t=$( echo '\t' ) + +# Check every commit after chkafter (or all commits if chkafter was not +# provided) for a trusted signature, listing invalid commits. %G? will output +# "G" if the signature is trusted. +git log --pretty="format:%H$t%aN$t%s$t%G?" "${chkafter:-HEAD}" --first-parent \ + | grep -v "${t}G$" + +# grep will exit with a non-zero status if no matches are found, which we +# consider a success, so invert it +[ $? -gt 0 ] +``` + +If you run the above script using the branch setup provided above, then you will +find that neither of the commits made in the `diverge` branch are listed in the +output. Since the merge commit itself is signed, it is also omitted from the +output (leaving us with only the unsigned commit mentioned in the previous +sections). To demonstrate what will happen if the merge commit is _not_ signed, +we can amend it as follows (omitting the `-S` option): + +```sh +$ git commit --amend +[master 9ee66e9] Merge branch 'diverge' +$ ./signchk +9ee66e900265d82f5389e403a894e8d06830e463 Mike Gerwitz Merge branch 'diverge' +f72924356896ab95a542c495b796555d016cbddd Mike Gerwitz Yet another foo +$ echo $? +1 +``` + +The merge commit is then listed, requiring a valid signature. ^[If you wish to +ensure that this signature is trusted as well, see [the section on verifying +commits within a web of trust](#script-trust).] + + +## Summary + +* [Be careful of who you trust.](#trust) Is your repository safe from + harm/exploitation on your PC? What about the PCs of those whom you trust? +** [Your host is not necessarily secure.](#trust-host) Be wary of using + remotely hosted repositories as your primary hub. +* [Using GPG to sign your commits](#trust-ensure) can help to assert your + identity, helping to protect your reputation from impostors. +* For large merges, you must develop a security practice that works best for + your particular project. Specifically, you may choose to [sign each + individual commit](#merge-3) introduced by the merge, [sign only the merge + commit](#merge-2), or [squash all commits](#merge-1) and sign the + resulting commit. +* If you have an existing repository, there is [little need to go rewriting + history to mass-sign commits](#commit-history). +* Once you have determined the security policy best for your project, you may + [automate signature verification](#automate) to ensure that no unauthorized + commits sneak into your repository.