Git Rebase Tips and Tricks

I don’t use a rebase flow on every project, but when I do, here are the habits that help keep things smooth.

The Basic Command

I use this formulation of the git rebase command:

             (1) The commit where we branched from
              |
              |           (2) The branch we want to rebase on top of
              |            |
              |            |              (3) The flag to keep merge bubbles
              |            |               |
           .------. .-------------. .-------------.
git rebase old_base --onto new_base --rebase-merges
  1. By providing the old_base explicitly we avoid scenarios where git gets confused [1].
  2. The branch we want to replay all of our commits on top of.
  3. This keeps the empty commits that create merge bubbles.
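
For example, if my branch was forked from commit abc1234 (a hypothetical hash) and I want it replayed on top of the latest main, the invocation would look like this:

git rebase abc1234 --onto origin/main --rebase-merges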

Keeping merge bubbles is seemingly another contentious topic, but I find them valuable. For example, with this listing using git lol [2] I can see that feature1 was potentially less complicated than feature2 because it required fewer commits.

*   cb08e97c1a Merge pull request #2 from feature2
|\
| * a5c310e392 Implement feature2
| * e07178d052 Refactor to make feature possible
|/
*   3fe7557433 Merge pull request #1 from feature1
|\
| * 07b845a110 Implement feature1
|/
*

* The branch names/commit messages in these examples are not good examples of naming/describing but I’ve kept them short to keep the example small.

This view also allows me to know what commits would need reverting if I want to back out a feature.
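
For example, backing out feature2 from the listing above could be as simple as reverting its merge commit, where -m 1 tells git which parent is the mainline to revert against:

git revert -m 1 cb08e97c1a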


Verifying the Result

Clean Rebase

If the rebase was clean and there were no conflicts that I had to resolve, I tend to verify that the result is good by diffing between my local branch and the remote. For this I have another alias git dfr [3] (short for diff remote). A successful result would essentially just contain a diff showing the changes that went into the base branch after the point the current branch was forked.

This breaks down when rebasing a branch that has gotten quite out of date with the new base. Keep in mind that the diff includes all the changes that went into the new base branch. This can produce a lot of output, and if you weren’t the one who made those changes, it can be tricky to reason about.

Rebase that had merge conflicts

When I’ve had to resolve merge conflicts during the rebase the above diff isn’t very helpful because the changes dealing with the merge conflicts are mixed in with the changes that went into the new base branch. To get a better view of things I reach for git range-diff old_base..old_head new_base..new_head.

This command tries to match up the commits in the two ranges, using heuristics like the commit message, and then creates a diff between each pair of matched commits. The output is a little hard to read because there are potentially two levels of ± indicators in the gutter. Persevere and it will make sense, especially if you have coloured output in your terminal.
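
To give a flavour, the top of the output looks something like this (new-side hashes invented): a = means the commit survived the rebase unchanged, while a ! means its diff changed and a nested diff-of-diffs follows.

1:  e07178d = 1:  f31a2b9 Refactor to make feature possible
2:  a5c310e ! 2:  8c4d5e6 Implement feature2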


Fixing when it goes wrong

Using the verification steps above, I sometimes discover that I’ve messed up a merge conflict. I’d rather try and fix the broken commit itself over adding a new commit. To achieve this I reach for an interactive rebase following these steps:

  • Find the SHA of the parent for the broken commit
  • Run git rebase --interactive parent_sha --rebase-merges
  • In the text editor find the sha I want to edit and change its option to e/edit
  • Follow the usual process of a rebase to step through the reapplication of commits

If you’ve ever used git rebase -i before you’ll notice that adding the --rebase-merges flag really steps up the difficulty level. For simple edits it’s easy enough to ignore commands like label, reset and merge and concentrate on the normal options you may be used to.
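
As a rough illustration, assuming the history from the earlier listing, the start of the generated todo list might look something like this (treat the exact layout as indicative; git generates the labels for you). To fix a broken commit you would change its pick to edit:

label onto

# Branch feature1
reset onto
pick 07b845a Implement feature1
label feature1

reset onto
merge -C 3fe7557 feature1 # Merge pull request #1 from feature1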


Starting again

Sometimes stuff really goes wrong and it’s time to admit defeat (for now) and call git rebase --abort. Even after years of rebasing, I still end up here a lot. It’s not a sign of failure - usually it just means the first attempt went wrong and I’ve learned what to do differently next time.


Pushing changes

I always use the --force-with-lease flag when pushing changes as it’s slightly safer than plain --force. Essentially --force-with-lease bails out if git notices that your copy of the remote is not up to date. This reduces the chances of you clobbering someone else’s work because you’d need to do a git fetch to get the latest changes and then resolve any conflicts locally.
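
In practice the push is just:

git push --force-with-lease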


Preventing pain

To reduce painful rebases it’s good practice to rebase early and often. The longer you leave branches to diverge, the more chance you have of hitting conflicts, so integrating as early as possible is beneficial.


Conclusion

The tips and tricks above have taken years to figure out. As well as knowing the commands, I think practice and trial and error are really the only ways you get better at this stuff, so don’t be afraid to get stuck in.


  1. When doing a rebase flow if you get a bit behind you can find yourself needing to rebase branches on top of bases that have themselves been rebased. In this case git tends to pick the wrong base because it has to find the first commit in common. This can result in duplicate commits ending up in the output and potentially more merge conflicts to handle. 

  2. The alias is configured in ~/.gitconfig in an [alias] section

    [alias]
        lol = log --graph --decorate --pretty=oneline --abbrev-commit
    

  3. The alias is configured in ~/.gitconfig in an [alias] section

    [alias]
        dfr = !sh -c 'git diff origin/$(git symbolic-ref --short HEAD)..$(git symbolic-ref --short HEAD) "$@"'
    

Sometimes a Great DX Is Just a Horrible Pop Up

Great developer experience isn’t always about polished UIs; sometimes it’s as simple as surfacing the right problem at the right time. One small feature I added to our tech estate that always makes me smile when I use it is a horrible global pop up. Here’s what led me to build it.


The Problem

I work on a mobile app team that owns parts of the backend estate. We often need to spin up multiple web services locally and point our simulator at them. We’ve built a GUI that reduces all the complexity of configuring these services down to a few play buttons.

This worked well for years until we started needing to talk to various services using OAuth. To boot these services we now need to obtain a JWT and inject it into the container.

Obtaining the JWTs is easy as our delivery platform team provide helpful tooling. Where we hit pain is that developers are used to starting these services and forgetting about them, potentially having them run for days. That doesn’t work so well now: the JWTs have short expiry times, meaning that services start to fail in interesting ways that may not be super obvious to debug.


The Solution

I solved this by adding a global pop up that shows which service’s token has expired and provides a restart button. I could have automatically restarted services but it feels more user friendly to let developers decide when this is going to happen. It looks like this:

[Image: An example of the pop up discussed in the post]

* The image capture software I used makes this pop up look novelty sized; it’s not this big in reality, but it made me chuckle so I’ve left it in.

Design notes on the behaviour of the pop up

  • It sits on top of all windows
  • It can be dismissed but comes back if you activate the application again
  • If more than one service needs restarting it offers a single button to restart them all

How does it work technically?

I’m glad you asked… as the services are all running in docker we can run docker inspect <container-id>, which returns a load of helpful information about the running container. Nestled away under Config > Env are all the environment variables that were provided to run the docker container. We can grab the environment variable that contains the JWT and then decode the payload to get access to the expiry information.
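
For example, assuming the jq tool is available, listing the container’s environment variables looks roughly like this:

docker inspect <container-id> | jq -r '.[0].Config.Env[]'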

Our tooling is built using Kotlin Compose Desktop so these code snippets are for Kotlin. We start by declaring a @Serializable type that represents the payload and use a handy custom serializer to get the expiresAt value in the right format.

@Serializable
private data class OAuthPayload(
    @Serializable(with = DateSerializer::class) val expiresAt: LocalDateTime,
)

With this type declared we need to unpack the JWT, decode the base64 value and then parse the JSON body.

// Split the JWT into header.payload.signature and keep the payload
val (_, payload, _) = jwt.split(".")
// Base64-decode the payload and parse the JSON body
val expiresAt = json.decodeFromString<OAuthPayload>(payload.decodeBase64String()).expiresAt

At this point we know when the token expires and we can simply schedule the UI to appear at the right time.
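
A minimal sketch of that scheduling in Compose (the names and callback shape here are illustrative, not our actual code):

import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import kotlinx.coroutines.delay
import java.time.Duration
import java.time.LocalDateTime

@Composable
fun TokenExpiryWatcher(expiresAt: LocalDateTime, onExpired: () -> Unit) {
    // Re-runs whenever the expiry changes, e.g. after a restart issues a fresh token
    LaunchedEffect(expiresAt) {
        val millis = Duration.between(LocalDateTime.now(), expiresAt).toMillis()
        if (millis > 0) delay(millis)
        // The token has expired: time to surface the pop up
        onExpired()
    }
}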


The Result

As a developer I always get a sense of satisfaction when I tap the button and it automagically gets me a new token and restarts my service. I knew the pain of manually getting a token and then trying to remember the right docker commands to stop the running service, export my fresh new JWT and restart the service, so I enjoy seeing the friction removed.

As someone supporting other developers, I love that I can’t remember the last time someone came to me who had been tripped up by expired tokens.


Conclusion

Not everything has to be an engineering masterpiece; it just has to solve a problem. These OAuth tokens were a real usability issue for developers who are mainly focussed on the front end, where running backend services locally is more of a secondary concern. The best developer tools don’t just automate; they anticipate where developers might trip up and quietly save the day.

Chill Out with the Defaults

I predominantly work in Swift and Kotlin, both of which support default arguments. As with any feature it’s worth being careful as overuse can lead to unexpected design trade-offs.

A common pattern I keep seeing in various codebases I work on is that data transfer objects are being defined using default arguments in their constructors. I think this leads to a few issues that I’ll explore in this post.

A Simple Example

Here’s a typical example of a type with a default argument on tags, in both Swift and Kotlin.

Swift:

struct BlogPost {
    let title: String
    let tags: [String]

    init(title: String, tags: [String] = []) {
        self.title = title
        self.tags = tags
    }
}

Kotlin:

data class BlogPost(
    val title: String,
    val tags: List<String> = emptyList()
)

It’s not always clear why this is done. I suspect it’s often out of habit or convenience for testing. My suspicion that testing is the driver comes from seeing this pattern repeatedly in codebases where the production code explicitly provides every argument, and therefore never makes use of the defaults, but the tests do. It gets my spidey senses tingling when it feels like we are weakening our production code in the service of adding tests.

The unintended consequence of these defaults is that the compiler can no longer be as helpful.


Exhaustivity Tangent

Just to make sure we are on the same page, let’s talk about exhaustivity checking with enums, as hopefully people will have experience with this. If we declare an enum and later switch over its cases, the compiler can check that we cover every case (assuming we don’t add a default case).

For example, let’s start with a PostStatus with two cases:

Swift:

enum PostStatus {
    case draft
    case published
}

func outputFolder(status: PostStatus) -> String {
    switch status {
        case .draft: "/dev/null"
        case .published: "blog"
    }
}

Kotlin:

enum class PostStatus {
    Draft,
    Published
}

fun outputFolder(status: PostStatus): String {
    return when (status) {
        PostStatus.Draft -> "/dev/null"
        PostStatus.Published -> "blog"
    }
}

If we add a third case of archived then the compiler will force us to revisit our outputFolder function as it’s no longer exhaustive:

Swift:

enum PostStatus {
    case archived
    case draft
    case published
}

func outputFolder(status: PostStatus) -> String {
    switch status { // Error -> Switch must be exhaustive
        case .draft: "/dev/null"
        case .published: "blog"
    }
}

Kotlin:

enum class PostStatus {
    Archived,
    Draft,
    Published
}

fun outputFolder(status: PostStatus): String {
    return when (status) { // Error -> 'when' expression must be exhaustive. Add the 'Archived' branch or an 'else' branch.
        PostStatus.Draft -> "/dev/null"
        PostStatus.Published -> "blog"
    }
}

This is great because the compiler will guide us step by step through every callsite so we can decide on the appropriate action at each one.


Missing Exhaustivity 😢

If we agree that exhaustivity checking is a good thing then we can extend the same logic to the first example. Let’s say we create instances of our BlogPost type in a few places in our codebase and we want to add a new property, isBehindPaywall. If we give it a default value then the compiler doesn’t help us by highlighting all the callsites we should reconsider. If we are lucky, we make the change and all is fine; if we aren’t so lucky, we could accidentally have blog posts being hidden or shown when they are not supposed to be. In this case I’d much rather the compiler made me check every callsite so I can make the correct decision.
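
In Kotlin terms this means declaring the new property without a default, so the compiler flags every existing callsite (the Swift version follows the same shape):

data class BlogPost(
    val title: String,
    val tags: List<String> = emptyList(),
    // No default: every existing callsite fails to compile until it makes a choice
    val isBehindPaywall: Boolean
)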

In practice all this means is that I have to explicitly specify the isBehindPaywall argument and accept that there might be some duplication:

Swift:

// Explicit
BlogPost(title: "Some blog post", isBehindPaywall: false)

// Implicit
BlogPost(title: "Some blog post")

Kotlin:

// Explicit
BlogPost(title = "Some blog post", isBehindPaywall = false)

// Implicit
BlogPost(title = "Some blog post")

Local Reasoning

The explicit version above has another strength: improved local reasoning. If I want to know how the isBehindPaywall state was decided I can simply look at the callsite that instantiated the instance. In the defaults case this isn’t as simple: first I need to look at the callsite, then, if no value was provided, I need to look up the declaration of BlogPost. With IDEs that let you click through this might not seem like a hardship, but it also means we are vulnerable to changes made at a distance, e.g. someone could change the default value and cause wide-ranging side effects at any callsite that didn’t explicitly provide a value.


Discoverability

You might think it’s fine: when I add the property I’ll go and check every callsite myself manually. This is all well and good if you are the sole owner of the codebase and you aren’t publishing your code as a library. But bear in mind it’s not a permanent fix, as anyone can come along and create an instance of BlogPost, and they may or may not notice the isBehindPaywall option, which means they could get it wrong.

The issue is that when people come to create instances of our BlogPost type they can get away without providing a value for isBehindPaywall and be blissfully unaware that it even exists or what impact it has, e.g.

Swift:

BlogPost(title: "My Blog post")

Kotlin:

BlogPost(title = "My Blog post")

Respecting Boundaries

Another subtle issue is whether something like a data transfer object should even have this knowledge bestowed upon it, or whether some code that owns the business rules should be in charge.

Consider this scenario I had recently:

I have a backend service that supplies data for Android and iOS clients. The backend uses kotlinx.serialization, iOS uses some legacy JSONSerialization code and Android is using Gson.

       +---------+
       | Backend |
       | kotlinx |
       +---------+

+---------+  +-------------------+
| Android |  |        iOS        |
|   Gson  |  | JSONSerialization |
+---------+  +-------------------+

With this setup we have 3 different code bases that are bound by an informal contract of what the JSON should look like. Each platform is using different libraries to encode/decode and could have subtle differences in how this is done.

We also have a Kotlin Multiplatform library so it makes sense to refactor like this:

  • Extract the code from the backend service to the Kotlin Multiplatform module
  • Utilise this kotlinx.serialization based code from all 3 places
       +-----------------------+
       | kotlinx.serialization |
       +-----------------------+
            ^      ^      ^
           /       |       \
          /   +---------+   \
         |    | Backend |    |
         |    +---------+    |
         |                   |
       +---------+    +---------+
       | Android |    |   iOS   |
       +---------+    +---------+

We still have 3 code bases that have to agree on how the JSON is structured but now that is handled in a more concrete way by providing the type and the encoding/decoding logic in a shared library.

Originally the type that lived in the backend service had default values encoded into it, but with this new split that doesn’t really make sense. The Android/iOS clients are supposed to be totally dumb and just trust the data handed to them, yet the original type knows too much with defaults baked in. It makes much more sense to strip the defaults and let the business rules on the server populate these values, which means the type is as simple as possible.
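
As a sketch, the shared type after stripping the defaults might look as simple as this (the name and fields are illustrative):

import kotlinx.serialization.Serializable

@Serializable
data class BlogPostDto(
    val title: String,
    // No default: the backend's business rules populate this and the clients just trust it
    val isBehindPaywall: Boolean
)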


Conclusion

It may seem like I’m beating up on default arguments but in reality I use them all the time. My main point is: before adding defaults, ask what you might lose. Sometimes explicit arguments add a little duplication but make your code safer, more discoverable and easier to reason about.