This is the sixth article in a series on the basics of using Git. The other articles in the series are:
- Basic Git for DBAs: Getting Started with Git
- Basic Git for DBAs: Sharing Files Through GitHub
- Basic Git for DBAs: the Basics of Branches
- Basic Git for DBAs: Making Changes in GitHub
- Basic Git for DBAs: Merging Code Between Branches
- Basic Git for DBAs: What's a Pull Request?
- Basic Git for DBAs: Managing PowerShell Scripts
- Basic Git for DBAs: Ignoring Files and Customizing Your Environment
In the previous articles, we set up a repository, shared files, branched code, and merged it back together. In this article, we will look at pull requests, which have become quite common in modern software development.
Code Reviews
Early in my career, I got a job working with a large corporation writing code. I worked on a team of about 20 developers, and we managed a number of internal projects at any one time. Often it was 1-3 of us that were writing code, either for a new application or fixing bugs in an existing one. This was before DevOps, and we completed an application and turned it over to an operations team for clients to use.
We didn't have much feedback from our live software, only tickets to address problems. We did, however, take pride in our work. My boss didn't want us throwing code over some wall and hoping for the best. What we did have was a strong code review process. This was the type of thing I'd heard about in University, and we adhered to it. When I had written and tested code, I had to find two other developers to review it with me. I'd print out 2 or 3 copies and highlight the places I'd made changes to the codebase. We had a relatively short 10-20 minute meeting to go over my work before it could be sent to a QA group.
That built good habits, and I've seen a few organizations that followed a similar process, but until DevOps gained some momentum in the early 2010s, most clients I worked with trusted developers to write code that could get through a QA process. There wasn't a lot of peer review in many organizations. This didn't stop a number of companies, like Smart Bear, from building products to help with code review. Not everyone avoided review, but it wasn't as common as I'd have hoped it would be.
The Pull Request
As Linux and Git gained popularity and there were more contributors, there needed to be an easy way for developers to let others know their code was ready to be merged. At some point this became a request to "pull my changes" into another branch. Git actually has a "git request-pull" command to generate a synopsis of changes.
With the popularity of GitHub as the most well known git hosting organization, this became formalized into a pull request (or PR) terminology. These hosting providers incorporated this feature into their offering and developers have become used to using this as a way of communicating. There are other terms used with some hosting providers, but the industry seems to have standardized on pull request being the way in which we ask others to review out code and discuss its merits.
In this article, we will look at how to create and work with a pull request on GitHub, though the process is similar in Azure DevOps, BitBucket, and other providers.
Creating a Pull Request on GitHub
In a previous article we looked a branching, and then at merging in a second one. We need both those skills for our pull request. If you don't understand those topics, I'd encourage you to go read the other articles.
I'm going to create a branch with one of my users, sjonesdkranch. I'll do this and with a base of main, I'll branch to "feature-glenn4". My intention is to add a fourth script from Glenn Berry's diagnostic queries. Once I have the branch, I'll add the new file (GlennBerry_2017Diagnostic_4.sql) with some code for checking backup status. Once I do this, I'll commit this and push it to the repo on GitHub. When I do that, and log in as my user, I see something like the image below. In this case, at the top of my list of code, I have a notice of a push and a button to create a pull request. I also see the branch I created is the current branch and my user pushed the last commit.
Let's create the pull request for someone else to check the code. When I click the top green button, I see this image:
The most important part is above the title box. Here is where we see the gray bar that shows this is merging into "main" from "feature-glenn4". If I need to merge to a different branch, I should click the drop down and choose that one. Here, however, I'm just working feature branches off main, which is what many DBAs might do with scripts like these.
The default title for the pull request is my commit title. This is good, because often I have someone reviewing what I've committed. If I have a few commits stacked up, I can edit this.
Below that I have a large edit box where I can write more information, add images, etc. This is a standard post edit box where I can describe what the reviewed should look for. This has defaulted to my commit message, but I should ensure this is a good explanation for why I am requesting the code to be reviewed. I'll edit this, but there are a few other things to look at.
On the right side, I have a number of sections. Each of these can add some context to my request. I'll talk about each of these below.
Reviewers: I can pick people from my organization that can review this PR. In this case, I only have one other user with rights, but I can also leave this blank. If I do, then anyone can pick it up, but there aren't necessarily notifications.
Assignees: I can assign specific people to look at this. I might do this if I want a specific person with domain knowledge to look at this. Typically in a team I might leave this blank if anyone can review the PR. Note, I can assign myself the PR.
Labels: These are items that describe the PR itself. For example, I might note this as an enhancement, if that's appropriate.
Projects/Milestones/Linked Issues: Often we may assign work into projects, like this might be the DiagnosticQuery project. In my case, I have no projects, but once I get into a team, I often do have some projects. I can link this item to a project. The same goes for milestones and issues. I can link this to those items. I'll look at these items in a later article.
Here is my updated pull request. This has more data filled out.
At this point, I'll click the "Create pull request" button. This will create the request and give me a screen letting me know this. The PR has some information that tracks what happens. I see my commit at the bottom of a graph, then the label, and then the PR itself at the top. Below that I also see an automatic evaluation by GitHub as to whether this PR will cause any conflicts on merge. You can see that this won't.
Below this, I have a comment box, where I can add a comment to the PR if I need to. I also can edit the various metadata items on the right.
The PR is open, but what does that mean? To find out, let's see what my other user, wayOutwest, sees.
Notifications
You can configure notifications, but typically anyone associated with an item is notified when it changes. Or if they are associated with the repo, they get notified for general items. In this case, I actually got an email at my Redgate account. This let's me know the title of this, I see the comment in the email, and I have links.
If I go to the main repo page, I don't see anything as far as a banner, but there is a "1" next to Pull Requests, showing me one is open.
I can click the Pull Request header, or click the top link in my email. The email also has a link directly to the changed file and patch links that show the direct markup that git would add to the files.
When I get to the main PR page, I see my single PR open. If there were more, and in a busy project and large team there will be, I'd see them all listed.
I can click on that one item, and I'll see the details. This looks just like the last screen my sjonesdkranch user saw, with the graph of the PR and details.
I can see there are no conflicts, and I can comment or change metdata. For now, I'll click the drop down next to the Merge button. When I do, I see three options: Merge, squash, or rebase. We won't cover the squash and rebase here, but for now, we'll make a merge commit.
If I choose that, the button changes to "confirm merge". While slightly annoying for routine work, we don't want to accidentally commit something wrong. That's a bigger pain to undo, especially when there's a CI flow and other work stacking up. I can choose which email to associate with this when I have multiple ones in GH.
When this is done, I can see the closed PR. Tracking continues, showing me that I merged this with a new commit. This merged into the main branch, and if I were to look at that branch, I'd see this file as a part of it. Before this, only the feature branch had the file visible.
By default, the main PR page doesn't show closed PRs, and we can see that when we go back and look at this. You can see this as the default filter above the list is "is:pr is: open", two items.
My sjonesdkranch user also gets a notification of the merge in email.
This is a happy path, common way that we use a pull request to merge changes with some check.
Feedback
Let's look at a more complex PR. In this scenario I'll add a third user to my repo and then I'll add a couple commits to a new branch. I'll add the user, way0utwest2, to my list of collaborators. Once this is done, I'll have way0utwest2 do this:
- Accept the invitation to collaborate
- Create a feature branch, feature-sales
- alter the salesheader.sql file to add a column.
- commit this change
- alter the readme file
- commit this change
- create a pr with two changes to main.
While this happens, I'll also have way0utwest do the following:
- change to the feature-spBlitz branch
- alter the readme file
- alter the salesheader.sql file
- commit both changes in 1 commit.
- merge these changes to main.
This is a more complex scenario, but I want to show a few things about what can happen in a PR. Once I've done this, my way0utwest user sees a pull request on the site. If I open the PR, I see a few things in the image below. I see the commits in the PR (2 of them), and I see the label assignment with the assignment to the sjonesdkranch user. I also see there is a conflict with the main branch. Unlike the PR above, this time two people have edited the same file in a way that GitHub can't easily resolve by itself.
In this case, I don't like this as the owner of the repo, because I want to be able to pick this up. I'd prefer the PRs aren't assigned.
I'll click Comment. This puts a comment on the PR. If I return to the view from the way0utwest2 user, I see this:
I can convert this back to draft (upper right), if things are wrong. In this case, however, I can do a couple things. First, I'll remove the assignment to a specific user. Then I'll add another comment to the repo noting these changes. This acknowledges what is going on.
While this is happening, I'll also go into the repo as sjonesdkranch and add a comment on resolving the issues. When I do that, my open browser page for way0utwest2 actually updates with the comment. I also get an email.
The email note contains the comment as well:
If I click the "Resolve Conflicts" button on the PR, I'll get this page. A couple things to note. First, this is going to resolve the conflicts between the branches and commit to feature-sales. We see this at the top below the PR title. Second, the markup shows me my changes (<<<<) and the changes in main (>>>>>).
I am going to follow the advice in the comment and move the Complete column first. When I edit this, I remove the git markup lines. Note, this does in no way test the code changes. I really need to fix my local database if I'm working on this code. Really, I'd want to drop and recreate this table in dev to be sure my edits are correct. There is no code validation here, so syntax errors aren't caught.
When done, things look like this:
Once done, there is a "mark as resolved" button in the upper right. I click that I can see resolved, and then get a "Commit merge" button. Note, double check where this commit is going (feature-sales).
When I do that, I can see in the flow of the PR, that I have a new commit and that now we have no conflicts.
The PR is still open at this point. Way0utwest2 could close this and merge the changes, but it's bad form to close your own PR. Someone else should review the code. Some people even have rules in place to prevent a PR creator from closing their own PR and merging code.
Here, I'll go back to the sjonesdkranch user and close the PR, merging the code. When I click the "merge" button, I'll get a second button to confirm. When I do that, things merge, and I see the PR closed. I also see a note that the branch can be deleted. The original developer should do this.
Summary
This can be confusing, especially as I am switching users. Typically a developer is just one user and they see the comment and PRs created by others and deal with them. In this article, we showed how to do a basic PR that has no issues and merge it. While I used two users, typically you would either create the PR or merge it.
We also showed how to add comments and communicate with other developers. Doing this in GitHub allows the flow of the code to be documented and prevents information in email (or in person) from becoming fragmented. While you might discuss this separately from GitHub, any changes to code here ought to include some discussion in the PR.
I would urge you to get a friend or two and build a repo. Make some changes and commit these in branches, then open PRs and practice working on them. Using a basic Hello, World type codebase is fine, but for SQL Server, perhaps you want to slowly add tables, views, and stored procedures, allowing each person to make changes, comment on PRs and merge changes as appropriate.
Just don't forget to delete your branches.
In the future articles, we will look at how to make more complex changes on GitHub.