Friday 28 December 2018

GIT: Creating and Applying Patch Files in Git

In a previous article, I talked about how to use git-cherry-pick to pluck a commit out of a repository branch and apply it to another branch.
It’s a very handy tool to grab just what you need without pulling in a bunch of changes you don’t need or, more importantly, don’t want.
This time the situation is the same. We have a commit we want to pull out of a branch and apply to a different branch. But our solution will be different.
Instead of using git-cherry-pick we will create a patch file containing the changes and then import it. Git will replay the commit and add the changes to the repository as a new commit.

What is git-format-patch?

git-format-patch exports commits as patch files, which can then be applied to another branch or cloned repository. Each patch file represents a single commit, and Git replays that commit when you import the patch file.
git-format-patch is the first step in a short process to get changes from one copy of a repository to another. The old style process, when Git was used locally only without a remote repository, was to email the patches to each other. This is handy if you only need to get someone a single commit without the need to merge branches and the overhead that goes with that.
The other step you have to take is to import the patch. There are a couple options for that but we’ll use the simplest one available.
Let’s create our patch file.

Using git-format-patch

I am in the repository the-commits and have the experimental_features branch checked out.
This experimental_features branch has an important change in it that I want to bring to a feature branch I have going. This feature branch is going to be merged into the development branch (and eventually the master branch) so I only want to include non-experimental changes. For that reason I don’t want to do a merge: it would pull in the other features that are half-baked and would mess up my production-path branches.
Here’s the latest when I run git-log:
$ git log
commit 4c7d6765ed243b1dbb11d8ca9a28548561e1e2ef
Author: Ryan Irelan 
Date:   Wed Aug 24 08:08:59 2016 -0500

another experimental change that I don't want to allow out of this branch

commit 1ecb5853f53ef0a75a633ffef6c67efdea3560c4
Author: Ryan Irelan 
Date:   Mon Aug 22 12:25:10 2016 -0500

a nice change that i'd like to include on production

commit 4f33fb16f5155165e72b593a937c5482227d1041
Author: Ryan Irelan 
Date:   Mon Aug 22 12:23:54 2016 -0500

really messed up the content and markup and you really don't want to apply this commit to a production branch

commit e7d90143d157c2d672276a75fd2b87e9172bd135
Author: Ryan Irelan 
Date:   Mon Aug 22 12:21:33 2016 -0500

rolled out new alpha feature to test how comments work
The commit with the hash 1ecb5853f53ef0a75a633ffef6c67efdea3560c4 is the one I’d like to pull into my feature branch via a patch file.
We do that using the command git-format-patch. Here’s the command:
$ git format-patch a_big_feature_branch -o patches
We pass in the branch we want Git to compare against when creating the patch files. Any commits that are in our current branch (experimental_features) but not in a_big_feature_branch will be exported as patch files, one patch file per commit. We used the -o flag to specify the directory where we want those patches saved. If we leave that off, Git will save them to the current working directory.
When we run it we get this:
$ git format-patch a_big_feature_branch -o patches
patches/0001-rolled-out-new-alpha-feature-to-test-how-comments-wo.patch
patches/0002-really-messed-up-the-content-and-markup-and-you-real.patch
patches/0003-a-nice-change-that-i-d-like-to-include-on-production.patch
patches/0004-another-experimental-change-that-I-don-t-want-to-all.patch
Those four patch files (named sequentially and with a hyphenated version of the commit message excerpt) are the commits that are in the current branch but not the a_big_feature_branch.
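If you want to preview which commits fall in that range before writing any files, git-log accepts the same comparison. Here is a minimal sketch in a throwaway repository; the branch names (main, experimental) and the file notes.txt are made up for illustration and are not from the project above.

```shell
# Throwaway repository; every name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "base commit"

# Two commits that exist only on the experimental branch.
git checkout -q -b experimental
echo "one" >> notes.txt
git commit -q -am "first experimental change"
echo "two" >> notes.txt
git commit -q -am "second experimental change"

# Preview: commits reachable from HEAD but not from main.
git log --oneline main..HEAD
preview_count=$(git rev-list --count main..HEAD)

# Export the same range, one patch file per commit.
git format-patch main -o patches
patch_count=$(ls patches | wc -l | tr -d ' ')
echo "previewed $preview_count commits, wrote $patch_count patch files"
```

The two counts should match, since both commands walk the commits that are on the current branch but not on the branch you compare against.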
Let’s look at the guts of one of them.
From 4c7d6765ed243b1dbb11d8ca9a28548561e1e2ef Mon Sep 17 00:00:00 2001
From: Ryan Irelan 
Date: Wed, 24 Aug 2016 08:08:59 -0500
Subject: [PATCH 4/4] another experimental change that I don't want to allow out of this branch

---
 index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/index.html b/index.html
index f92d848..46e4eb2 100644
--- a/index.html
+++ b/index.html
@@ -9,7 +9,7 @@
   <!-- Set the viewport width to device width for mobile -->
   <meta name="viewport" content="width=device-width" />
 
-  <title>Little Git &amp; The Commits</title>
+  <title>Little Git &amp; The Commits FEATURING ELVIS BACK FROM THE DEAD</title>
 
   <!-- Included CSS Files (Uncompressed) -->
-- 
2.7.4 (Apple Git-66)
It looks like an email, doesn’t it? That is because git-format-patch writes patch files in the UNIX mailbox format. The body of the email is the diff that shows which files have changed (in our case just index.html) and what those changes are. Using this file, Git will recreate the commit in our other branch.

Specifying a Single Commit

In this situation, I don’t need all of those patch files. All but one are commits I don’t want in my target branch. Let’s improve the git-format-patch command so it only creates a patch for the one commit we do want to apply.
Looking back at the log, I know that the commit I want to apply has the hash of 1ecb5853f53ef0a75a633ffef6c67efdea3560c4. We include that hash as an argument in the command, but precede it with a -1 so Git only formats the commit we specify (instead of the entire history since that commit).
$ git format-patch -1 1ecb5853f53ef0a75a633ffef6c67efdea3560c4 -o patches
patches/0001-a-nice-change-that-i-d-like-to-include-on-production.patch
Now we get a single patch file, which is much safer because there’s no chance we’ll accidentally apply patches of changes we don’t want!
We have the patch file, now how do we apply it to our branch? Using git-am.

What is git-am?

git-am is a command that allows you to apply patches to the current branch. The am stands for “apply (from a) mailbox” because it was created to apply emailed patches. The handy thing about git-am is that it applies the patch as a commit so we don’t have to do anything after running the command (no git-add, git-commit, etc.).
The name git-am is a little strange in the context of how we’re using it but fear not: the result is exactly what we want.
Let’s apply a patch and see how it works.

Using git-am

The first thing we need to do is switch over to our target branch. For this example we’ll move to the branch we compared against in the git-format-patch command.
$ git checkout a_big_feature_branch
After that we’re ready to apply the patch file with the commit we want to include.
Note: I’m working in the same repository on the same computer. When I switch branches, the patch file comes with me because it is still an untracked file. If I staged and committed the patch file then I’d need to find another way to make it accessible. You could do this by moving the patch file out of your repository to where you can access it when on the destination branch.
Because we refined the git-format-patch command, we only have one patch file in the patches directory:
patches/0001-a-nice-change-that-i-d-like-to-include-on-production.patch
To apply the patch to the current branch, we use git-am and pass in the name of the patch we want to apply.
$ git am patches/0001-a-nice-change-that-i-d-like-to-include-on-production.patch
And then we get confirmation that the patch was successfully applied:
Applying: a nice change that i'd like to include on production
Looking at the log now we see our change is replayed as a commit in the current branch:
$ git log
commit 69bb7eb757b2356e365934fdbea744877c3092bb
Author: Ryan Irelan 
Date:   Mon Aug 22 12:25:10 2016 -0500

a nice change that i'd like to include on production
And now our change is there!
Note that the new commit has a different hash than the one we formatted as a patch. It sits on top of a different parent commit and was committed at a different time, and both of those are part of what Git hashes.
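To try the whole round trip without touching a real project, here is a small sketch in a throwaway repository. The branch names, file names, and commit messages are all made up for illustration. It exports one commit with git-format-patch, applies it with git-am, and then compares the original and replayed commits.

```shell
# Throwaway repository; every name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > page.txt
git add page.txt
git commit -q -m "base commit"

# An experimental branch with one commit we want and two we don't.
git checkout -q -b experimental
echo "risky" >> page.txt
git commit -q -am "half-baked change"
echo "good" > extra.txt
git add extra.txt
git commit -q -m "a nice change"
wanted=$(git rev-parse HEAD)
echo "risky again" >> page.txt
git commit -q -am "another half-baked change"

# Export only the wanted commit. The patches directory stays untracked,
# so it is still there after switching branches.
git format-patch -1 "$wanted" -o patches

# Replay it on the target branch as a new commit.
git checkout -q main
git am patches/0001-a-nice-change.patch
applied=$(git rev-parse HEAD)

# Message and author date carry over; the hash does not.
original_subject=$(git log -1 --format=%s "$wanted")
applied_subject=$(git log -1 --format=%s "$applied")
original_date=$(git log -1 --format=%ad "$wanted")
applied_date=$(git log -1 --format=%ad "$applied")
echo "original: $wanted"
echo "applied:  $applied"
```

Only extra.txt arrives on main; the half-baked edits to page.txt stay behind on the experimental branch.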

GIT: Using git-cherry-pick

Recently I ran into a problem on a project where I was working on the wrong branch and committed changes. Those commits were supposed to go elsewhere and now I need to get them into the correct branch!
There are a few options to solve this:
  • Merge the incorrect branch into the correct branch. But I don’t want to do that because there are items in the incorrect branch that I don’t want. So, that’s out.
  • Recreate those changes in my working branch and just go on with my day. But that’s a waste of my time and I’m adamantly against redoing work!
  • Create a patch and then apply that patch to the new branch.
All solid options but there’s still something better:
$ git cherry-pick
Let’s review where we’re at and then how to solve the problem using git-cherry-pick.

The State of Things

I created a new commit in my repository, in the branch called some_other_feature. But that’s the wrong branch!
$ git branch
    develop
    master
    my_new_feature
*   some_other_feature
    stage
The new commit should be on the my_new_feature branch. I could merge the branches but the some_other_feature branch contains commits and changes that I don’t want in the other branch (they are not ready for merging into any upstream branches, like develop or master).
Here’s the commit I need to get into my_new_feature:
commit ec485b624e85b2cad930cf8b7c383a134748b057
Author: Ryan Irelan 
Date:   Fri Aug 19 10:44:47 2016 -0500

    new contact page

Using git-cherry-pick

The syntax of git-cherry-pick is this:
$ git cherry-pick [commit hash]
The first step is to fetch the commit hash for the commit we want to cherry-pick. We can do that using git-log and then copying the hash (in full, or just the first 7 characters will usually work) to our clipboard.
Next, we need to be on the branch where we want the changes to be (our destination branch).
$ git checkout my_new_feature
Now we can run the git-cherry-pick command and apply the commit to our destination branch.
$ git cherry-pick ec485b624e85b2cad930cf8b7c383a134748b057
This will return something like this:
[my_new_feature 1bf8955] new contact page
Date: Fri Aug 19 10:44:47 2016 -0500
1 file changed, 1 insertion(+)
create mode 100644 contact.html
If we look at our log for this branch, we now see our commit:
$ git log
commit 1bf8955d5e6ca71633cc57971379e86b9de41916
Author: Ryan Irelan 
Date:   Fri Aug 19 10:44:47 2016 -0500

    new contact page
What’s happening when we run git-cherry-pick?
  • Git is fetching the changes in the specified commit and replaying them in the current branch. The commits are not removed from the source branch; they remain intact.
  • Because this commit is being applied to a different branch, and therefore sits on top of a different parent commit, it will get a different hash than the source commit.
With the problem solved, we are ready to move on with our development work!
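If you’d like to rehearse this before doing it on a real project, here is a minimal sketch in a throwaway repository. It reuses the branch names from above, but the files and the extra “unfinished work” commit are made up for illustration.

```shell
# Throwaway repository; file names and the extra commit are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example Author"
git config user.email "author@example.com"

echo "home" > index.html
git add index.html
git commit -q -m "initial commit"
git branch my_new_feature

# Work that landed on the wrong branch.
git checkout -q -b some_other_feature
echo "wip" > draft.html
git add draft.html
git commit -q -m "unfinished work"
echo "contact" > contact.html
git add contact.html
git commit -q -m "new contact page"
picked=$(git rev-parse HEAD)

# Replay just that one commit on the destination branch.
git checkout -q my_new_feature
git cherry-pick "$picked"
new_hash=$(git rev-parse HEAD)

echo "source commit: $picked"
echo "new commit:    $new_hash"
```

The destination branch gains contact.html but not draft.html, and the original commit is still on some_other_feature.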

GIT: The Pieces of Git

As far as the Git workflow is concerned, there are three pieces of Git that we should be aware of before moving forward with some slightly more complex (but totally doable!) explanation.
They are:
  • Repository
  • Index
  • Working Tree
Let’s cover each one in a bit more detail.

Repository

A repository is a collection of commits, each of which is a record of what the project’s working tree looked like at one point in time. You can access the history of commits via the Git log.
There’s always a current starting point in a repository and that’s called the HEAD. A repository also contains tags and branches.
The job of the repository is to be the container that tracks the changes to your project files.

Working Tree

This is a directory on your file system that is associated with a repository.
You can think of this as the filesystem manifestation of the repository. It’s full of the files you edit, where you add new files, and from which you remove unneeded files. Any changes to the Working Tree are noted by the Index (see below), and show up as modified files.

Index

This is a middle area that sits between your Git repository and your Working Tree.
The Index keeps a list of the files Git is tracking and compares them to your Working Tree when you make changes. These changed files show up as modified before you bundle them up into a commit.
You might have heard this called the staging area where changes go before they are committed to the repository as commit objects.
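If you ever use the -a flag when committing, then you are effectively bypassing the Index: Git turns your changes to already-tracked files directly into a commit without you staging them first. New, untracked files still have to be added before they can be committed.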
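You can watch a change move between these three pieces with git status --short, which prints two status columns: the first for the Index and the second for the Working Tree. This is a small sketch in a throwaway repository with made-up file names.

```shell
# Throwaway repository; file names are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "version one" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked file"

# Edit the file: the change exists only in the Working Tree.
echo "version two, a bit longer" > tracked.txt
unstaged=$(git status --short tracked.txt)   # " M tracked.txt"

# Stage it: the change is now recorded in the Index.
git add tracked.txt
staged=$(git status --short tracked.txt)     # "M  tracked.txt"
git commit -q -m "commit the staged change"

# With -a, tracked files skip the manual staging step.
# The brand-new file is not picked up because Git is not tracking it yet.
echo "version three, longer still" > tracked.txt
echo "new" > untracked.txt
git commit -q -a -m "commit with -a"
leftover=$(git status --short)               # "?? untracked.txt"

echo "unstaged: [$unstaged]"
echo "staged:   [$staged]"
echo "leftover: [$leftover]"
```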

GIT: The Big Picture of Git

When starting out with Git, it’s much easier to understand how to use it if we also understand the basics of what Git is.
By that I mean what Git is beyond the commands. Conceptually speaking.

What is a Version Control System (VCS)?

Git is a version control system. You’ll hear this referred to as a VCS sometimes. This is a general term to describe tools that track changes to files over a period of time. The reason we track the changes is so we can revert to previous versions, have a log of the work that was completed, as well as have an incremental backup of the project.
Version control systems aren’t just for code but that’s where they got their start and are most widely used.
You could–and I have–use version control for a book you’re writing.
You could use version control for designs you’re creating for a project.
You could use version control for a series of documents, like proposals, contracts, or agreements.
If it’s a file, it can be tracked by a version control system.
Okay, great. So why should you use a version control system for your projects?
First and foremost is because it’s a reliable way to track changes to your files. You might’ve used a simple version control system in the past where you zipped up a project and put a date on it, thereby capturing that day’s work. A version control system also lets you snapshot changes to the project but in a more structured and reliable way.

Getting Git

There are a lot of other version control systems out there and perhaps you’ve used one or two before. In the past I’ve used CVS, Subversion, and Mercurial, in addition to Git. I prefer Git because it is local, simple, and fast.
So let’s talk briefly, and at a high level, about how Git works.

Snapshots of Your Project

When you initialize a new Git repository, you are telling Git to track a set of files.
This set of files will be in a common directory (and subdirectories).
Git now cares about what happens to those files.
At first Git will just tell you that it sees a bunch of files but isn’t yet tracking them. It’ll be up to you to add those files so Git will care about them on an individual basis. After that, Git will notice every change to the file.
By initializing a Git repository, you tell Git: “Hey I want you to pay attention to this set of files. If you don’t know about a file yet, please tell me. If something changes in a file, please tell me. In return, I’ll commit those changes to your repository so you are aware.”
Let me try an analogy and see if this works.
Imagine you’re a summer camp counselor. Your job is to take care of, educate, and otherwise entertain a specific group of kids. The kids are organized into groups.
You arrive at the camp on Day 1 and you know you’re responsible for a group of kids. You walk into the room and your supervisor says: “Okay, Ryan, this is your group.”
You look out at the group of kids and say “Okay, y’all are mine. I will take care of you.”
“First order of business. I see I have 15 kids here. But I don’t know your names. I need you to form a line and then give me your names so I can write them on this list. This is the list I will use to track your progress at camp so I can report back to your parents when they pick you up on Sunday.”
The kids form a line and, one by one, they give you their name and you write it down on the paper attached to your clipboard. You now have them tracked and know exactly who the kids are.
As summer camp goes on, you watch your campers and their activities. If Suzanne swims a 1km lap in the lake, you mark that change down next to her name on the list.
If Albert gets sick because he ate too many bowls of chocolate pudding after dinner, then you mark that down next to his name to track his health.
The bottom line is: you are watching these kids and tracking how they change during their time at camp.
This analogy may be a little thin but I hope you get the idea. Key takeaway: Think about Git as a system that watches a set of files you tell it to.
This is the git init command.
Once you tell it to watch a set of files, you then have to introduce it to each file.
This is the git add command.
Just like with the campers and counselor, think about your Git repository as the set of files. Every time you make a change to a file and commit that change (record it as a change), Git snapshots the state of the entire collection of files at that moment.
This might sound like Git is just zipping up the files and such, but it isn’t.
Git only stores new content for the files that have changed. For the other files, the snapshot simply points to what was stored for the previous snapshot. This allows Git to efficiently store files without becoming unnecessarily large over the lifecycle of a project.
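One way to see this for yourself is to compare the object IDs Git records for each file in two consecutive commits. This is a small sketch in a throwaway repository with made-up file names: an unchanged file keeps the same ID in both snapshots, so both point at a single stored object.

```shell
# Throwaway repository; file names are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "alpha" > a.txt
echo "beta" > b.txt
git add a.txt b.txt
git commit -q -m "first snapshot"

# Change only one of the two files.
echo "alpha changed" > a.txt
git commit -q -am "second snapshot"

# Object IDs recorded for each file in each snapshot.
a_before=$(git rev-parse HEAD~1:a.txt)
a_after=$(git rev-parse HEAD:a.txt)
b_before=$(git rev-parse HEAD~1:b.txt)
b_after=$(git rev-parse HEAD:b.txt)

echo "a.txt: $a_before -> $a_after"
echo "b.txt: $b_before -> $b_after"
```

The ID for a.txt differs between the two commits, while b.txt reports the same ID twice.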
So, what is Git doing?
It’s caring about your project files because you told it to. Every time you work with Git think of it in this way. Git cares a lot, and sometimes you have to tell it to no longer care about certain files (with a .gitignore file), too.
And, as you’ll learn in a different video on merging branches in Git, Git cares so much that it won’t let you lose changes or work. It really is looking out for you.

GIT: Git Merge Therapy

Git merge conflicts are normal and okay.
They are supposed to happen and, most likely, will happen regularly. They don’t happen because you did something wrong. They happen because Git is trying to protect you from losing your hard work.
If you’re new to Git or haven’t done extensive work with it in a team environment then you probably haven’t had the experience of a lot of merge conflicts.
In a future article I will share how you can resolve conflicts but right now, let’s talk about prevention.
We can do some things to prevent conflicts; here’s a list of a few:
  • Ignore generated files
  • Ignore cache or other runtime directories and files
  • Have a good branching strategy
  • Avoid whitespace errors
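For the first two items, the usual tool is a .gitignore file. Here is a minimal sketch in a throwaway repository; the directory names build/ and cache/ are hypothetical stand-ins for whatever your project generates at build time or runtime.

```shell
# Throwaway repository; directory and file names are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir build cache
echo "compiled output" > build/app.min.js
echo "runtime data" > cache/session.tmp
echo "source" > app.js

cat > .gitignore <<'EOF'
# Generated files
build/
# Cache and other runtime directories
cache/
EOF

# Only .gitignore and app.js should be reported as untracked.
git status --short
visible=$(git status --short | wc -l | tr -d ' ')

# check-ignore exits with 0 when a path is ignored.
if git check-ignore -q build/app.min.js; then
  echo "build/app.min.js is ignored"
fi
```

Files that never enter the repository cannot conflict, which is the whole point of ignoring them.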

GIT: Using Git Hooks

Git hooks are similar to SVN hooks; you can execute a script at a point in the Git routine. Git hooks are both local and server-side. The local hooks pertain to local activities, like committing, merging, checking out, etc. The server-side hooks deal with receiving pushes from a local client.
The scripts that Git hooks execute are stored in the .git/hooks directory of your project. In every repository there is a collection of sample hooks that you can rename and use as inspiration for your own.
The scripts can be in any language that supports executable scripts. The samples are mostly shell scripts, but you can use Perl, Ruby, Python, or something else that is familiar to you.
Here are the most commonly used hooks in Git:
  • pre-commit
  • prepare-commit-msg
  • commit-msg
  • post-commit
  • applypatch-msg
  • pre-applypatch
  • post-applypatch
  • pre-rebase
  • post-rewrite
  • post-checkout
  • post-merge
  • pre-push
  • pre-receive
  • update
  • post-receive

Local Git Hooks

Local Git hooks reside in the .git/hooks directory and are not treated as project files. They are not tracked in the Index and because of that they are not included when someone clones a repository.
Because of this there are some limitations to keep in mind while working with Git hooks. First and foremost: local hooks are not a reliable way to enforce policies, like a commit message structure. And, because you are unable to include the contents of the .git/hooks directory in version control, you will need to consider a way to share hooks that you’d like your team to use.
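One possible way to share hooks, offered here as a suggestion rather than the only approach, is to keep them in a tracked directory and point Git at it with the core.hooksPath setting (available since Git 2.9). The directory name .githooks is an arbitrary choice for this sketch, and the hook is a shell script to keep the example dependency-free.

```shell
# Throwaway repository; the .githooks directory name is an arbitrary choice.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# A hook stored in a normal, trackable directory.
mkdir .githooks
cat > .githooks/post-commit <<'EOF'
#!/bin/sh
echo "Please remember to push your commits!"
EOF
chmod +x .githooks/post-commit

# Each teammate runs this once after cloning; the setting is local to the clone.
git config core.hooksPath .githooks

# Commit the hook itself; the hook fires after this very commit.
git add .githooks
hook_output=$(git commit -q -m "track shared hooks" 2>&1)
echo "$hook_output"
```

This still relies on each person opting in, so it is a convenience for sharing hooks and not a way to enforce policy.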
Here are the local hooks:
  • pre-commit
  • prepare-commit-msg
  • commit-msg
  • post-commit
  • applypatch-msg
  • pre-applypatch
  • post-applypatch
  • pre-rebase
  • post-rewrite
  • post-checkout
  • post-merge
  • pre-push

Server-side Hooks

Server-side hooks are set up to kick off scripts when a typical remote repository action takes place.
There are only three server-side hooks:
  • pre-receive
  • update
  • post-receive
They all run when the server receives a git-push from a client. These are reliable methods for enforcing some sort of repository policy or standard. You can run checks, kick off an integration script, run tests, etc.
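As a rough sketch of how a server-side check can work, the example below sets up a local bare repository to stand in for the server and installs a pre-receive hook. Git feeds that hook one line per pushed ref on standard input, in the form old-hash new-hash ref-name. The policy itself (refusing pushes to a branch called protected) and all names are made up for illustration.

```shell
# A local bare repository stands in for the server; all names are illustrative.
work=$(mktemp -d)
cd "$work"
git init -q --bare server.git
mkdir -p server.git/hooks

# pre-receive reads "<old> <new> <ref>" lines; a non-zero exit rejects the push.
cat > server.git/hooks/pre-receive <<'EOF'
#!/bin/sh
while read -r old new ref; do
  if [ "$ref" = "refs/heads/protected" ]; then
    echo "Pushes to 'protected' are not allowed." >&2
    exit 1
  fi
done
exit 0
EOF
chmod +x server.git/hooks/pre-receive

# A client repository with one commit to push.
git init -q client
cd client
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example Author"
git config user.email "author@example.com"
git remote add origin "$work/server.git"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "first commit"

# The push to main passes the hook; the push to protected is refused.
if git push -q origin main; then allowed=yes; else allowed=no; fi
if git push -q origin main:protected 2>/dev/null; then blocked=no; else blocked=yes; fi
echo "push to main allowed: $allowed, push to protected blocked: $blocked"
```

Because the hook runs on the receiving side, a client cannot skip it the way a local hook can be skipped.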

Implement a Git Hook

For our example we’re going to implement a local hook. We’d like to display a message after each commit, reminding us to push our commits to the remote repository.
The setup for this message will be to use the post-commit hook. This hook is triggered after each successful commit. For our implementation, we want the hook to trigger and display a message for us. 
There isn’t a sample file for post-commit already in place. That’s okay, we can create it.
Create a new file called post-commit in .git/hooks.
I’m going to use Ruby as the scripting language. Remember, Git Hooks can use any executable script. But I like Ruby so we’ll use that. This Ruby code is as simple as it comes, with just a single puts statement.
Add this code to the post-commit file:
#!/usr/bin/env ruby
puts "======================="
puts "Please remember to push your commits!"
puts "======================="
With the code in place, we can then save the hook file and set it as executable.
$ chmod +x .git/hooks/post-commit
Test it by making a change to our project and creating a commit. If you see the message, it worked!
If you didn’t, check your code, that the hook file is executable, and that you have it properly named.
In another article I’ll talk about triggering hooks on the server.

GIT: The Slowest Git Commit in the World


In Git there’s the concept of “porcelain” commands and “plumbing” commands.
This is an obvious allusion to the toilet and its two types of interfaces. The simple, yet functional, porcelain. You interface with it and get the job done.
Behind the scenes is the plumbing. This does the dirty work of completing the job.
In Git, we have the same thing. The porcelain commands are the commands that you’ll typically use from the command line, such as git commit.
The plumbing commands are the low level commands that make up the Git system. They are the commands that do the, uh, dirty work, and make your repository track and manage files and changes.
In this video, follow along as Ryan uses Git plumbing commands to manually hash, create and commit objects in Git.
We’ll start off by creating a new directory for our project and initializing a fresh repository.
And we’ll end up with the slowest Git commit in the world.
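The video’s exact steps aren’t reproduced here, but one plausible sequence of plumbing commands for building a commit by hand looks like the sketch below. The file name and commit message are made up, and the repository is a throwaway one.

```shell
# Throwaway repository; file name and message are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name
git config user.name "Example Author"
git config user.email "author@example.com"

echo "hello plumbing" > hello.txt

# 1. Hash the file contents and store them as a blob object.
blob=$(git hash-object -w hello.txt)

# 2. Register that blob in the index under a path.
git update-index --add --cacheinfo 100644,"$blob",hello.txt

# 3. Write the index out as a tree object.
tree=$(git write-tree)

# 4. Create a commit object pointing at the tree (no parent: it is the first commit).
commit=$(echo "slowest commit in the world" | git commit-tree "$tree")

# 5. Point the current branch at the new commit.
git update-ref refs/heads/main "$commit"

git log --oneline
```

A single git add followed by git commit does all five steps for you, which is why the porcelain exists.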