Tracy Stampfli (Senior Staff Engineer at Slack) talks about modernizing and refactoring code, with lessons drawn from an overhaul of the mobile codebases at Slack.
Like what you see here? Our mission-aligned Girl Geek X partners are hiring!
- See open jobs at Slack and check out open jobs at our trusted partner companies.
- Watch all Elevate 2021 conference video replays!
- Does your company want to sponsor a Girl Geek Dinner? Talk to us!
Transcript
Sukrutha Bhadouria: Tracy’s a senior staff engineer at Slack. Prior to Slack, Tracy worked at Adobe for many, many years, specializing in client networking and support for streaming audio, video, and screen sharing. Tracy lives in San Francisco with her husband and two kids. Welcome, Tracy.
Tracy Stampfli: Yes. My name is Tracy Stampfli. I’m a senior staff engineer at Slack. I lead the iOS infrastructure team. iOS infrastructure basically handles things like networking, and data syncing, and all the stuff that’s a bit under the hood of the app, supporting the UI and the features.
Tracy Stampfli: I’m going to talk about modernizing mobile codebases, but I’m really going to talk about modernizing and refactoring codebases in general. I hope that a lot of this talk is actually generally applicable and not just about mobile.
Tracy Stampfli: I’m going to talk about some things we’ve learned from doing a rewrite, or at least a partial rewrite, of both our iOS and Android codebases at Slack. Why did we think this was necessary? What decisions did we make about how to rearchitect our code? What has made this successful, at least so far? It’s still in progress.
Tracy Stampfli: Why did we decide to do this? Well, Slack’s mobile codebases had not really had a big rewrite or refactor in a long time. Slack’s been around for about seven years, and on iOS, we hadn’t actually done a refactor in that entire time, or a major refactor. Android had done one about five years ago, but that’s still a while.
Tracy Stampfli: Both of these codebases had a lot of tech debt. They had a lot of obsolete design patterns that didn’t match up with how we do things now. There were a lot of inconsistencies, you know, five different ways to do the same thing.
Tracy Stampfli: That made it really confusing for new engineers and hard to onboard, because it was hard to figure out what the right way to do things was. The code was fragile and too tightly coupled, which made it easy to inject bugs.
Tracy Stampfli: All of these things added up to slow down feature development. That became a really big problem. Mobile development was starting to be a drag on feature development in general at Slack. That was something we knew that we had to tackle and do something about.
Tracy Stampfli: We decided we really needed to prioritize addressing this tech debt, and invest in more modern architecture, and improve our development practices to speed things up.
Tracy Stampfli: When deciding how to do this, there are a number of options. We’re going to do a big refactor, so how are we going to do this? We considered a few things. We could do just a full rewrite of our mobile apps. There are some nice aspects of that. You get to just toss out the legacy code and start afresh, new project, just start going with new patterns, and rewrite all the features the right way, the way that you’d like.
Tracy Stampfli: But obviously, there’s some risks with this as well. If you’re going to start from scratch and just fully rewrite the app, you have to bring it all the way up to feature parity with your existing code. For a product like Slack, that’s been around for a long time and has a lot of features, that’s going to be a really massive task.
Tracy Stampfli: While you’re doing that, you have to maintain two codebases. You have to keep the new codebase … There’s going to be feature development going on, so you have to be doing dual feature development in the old codebase and the new codebase.
Tracy Stampfli: Overall, this is just a very big, risky bet. What happens if you don’t ship that new, modern codebase for some reason? You’ve wasted a huge amount of work. We decided this was a bit too risky for us.
Tracy Stampfli: We considered some other options. When you talk about speeding up mobile development, one option that comes up a lot is sharing code. There’s a number of ways to do this. We could share code between iOS and Android. We could share code between the mobile apps and the desktop. There’s different frameworks for sharing code. You can share UI code or business logic. Slack actually tried to do some shared business logic a few years ago across the clients. It didn’t actually work out too well.
Tracy Stampfli: There are, again, some benefits here. Obviously, there’s this benefit of not rewriting the same feature or the same logic three times, or however many clients you have. You can also, hopefully, maybe share some developer resources across platforms if you go this route.
Tracy Stampfli: But there’s also, again, downsides. Shared code complicates your tooling. It makes things like just debugging, building the CI system … All of those things become more complicated. Also, the big danger is that, if you share code, you might lose native look and feel or native performance. This is a big issue for mobile.
Tracy Stampfli: We really want to take advantage of the latest and greatest features that each of the platforms has to offer. We want our iOS apps to feel like great native iOS apps, and same with Android. We want performance to be great on our mobile platforms. This was something that we were definitely worried about.
Tracy Stampfli: Also developer sentiment about this idea was just not very good. Our developers were just not excited about doing shared code. If the developers aren’t excited about it, it’s unlikely to be successful.
Tracy Stampfli: What’s left, basically? Well, you could refactor your existing codebase in place, essentially rebuild the airplane as you’re flying it. This is the option we ended up going with. There’s some, again, benefits and downsides to it.
Tracy Stampfli: Benefit. Reduced risk. We were basically having to refactor and ship this codebase continually. We just have one codebase, so at any given point in time, it always has to be shippable, because we have to keep releasing it.
Tracy Stampfli: We get faster payoff, because as we’re modernizing that codebase, it is continually being improved, and everyone working on the team is getting the benefit of that improved codebase as we go along. We don’t have to maintain dual codebases and try to develop things in both, which seems very challenging.
Tracy Stampfli: But it does mean that we don’t get to just get rid of our legacy code. We have to actually deal with it. We can’t get rid of the tech debt. We have to migrate and rearchitect our existing codebase.
Tracy Stampfli: We launched this thing called Project Duplo. You may have noticed, there’s a Lego Duplo theme to this presentation. That’s because, as I’ll talk about a bit more later, one of the big themes of Project Duplo was modularization, or breaking down the codebase into smaller building blocks. Duplo are the larger Lego bricks, so yes, basically a big theme about breaking down the app monolith.
Tracy Stampfli: This was a big rearchitecture of both of our mobile codebases that we launched a little while ago. It was coordinated across both mobile platforms. iOS and Android came up with proposals together, did an investigation together, scoped this effort together. We’re now running the project together. This was all a coordinated effort.
Tracy Stampfli: The goals were, number one, improved developer velocity. Start shipping features faster. But also adopt modern design patterns, and bring some of the patterns in our app more in keeping with current mobile development practices.
Tracy Stampfli: And really enable larger teams. Obviously, the mobile teams of Slack have grown a great deal since the company started. We hope they’re going to grow a bunch more. The patterns that work for smaller development teams don’t necessarily work for a team of 40, and what works for a team of 40 may not work for a team of a hundred. So we really wanted to enable our growing development team.
Tracy Stampfli: And also set us up to adopt future technologies we might be interested in. Like right now, we aren’t actually using SwiftUI, but we might want to in the future. We want to have an architecture that keeps that door open, and allows us to have that possibility.
Tracy Stampfli: This project had three phases. The first phase we called stabilization. The idea here was to complete a bunch of ongoing migrations and refactors that we already had in flight, things that we’d started, but then didn’t have the resources to finish. Again, that was really leading to inconsistency, so we wanted to remove the worst of the tech debt, the worst of the anti-patterns, and just clean the codebase up to set us up for the rest of the project. This phase had very well-defined work streams and really clear metrics for success, so we really could track our work very well.
Tracy Stampfli: I’m not going to get too deep into the details of what we actually did on each platform, due to lack of time. But this is just some highlights of some of the top goals on each platform.
Tracy Stampfli: On iOS, we wanted to move to being 100% Swift. We were already about 80% before this project started. And we wanted to finish some migrations onto some of our infra frameworks. Similarly on Android, you’ll see these common themes of finishing migrations, finishing adoptions of different patterns and different frameworks, and breaking down some of the infra frameworks to be more usable.
Tracy Stampfli: The second phase here was modularization. Here, we’re getting into the building blocks. The idea here is that we had already somewhat modularized our code. We had on both iOS and Android. We had some code that was split off into frameworks, but we wanted to go a lot further in that direction, because there’s a lot of benefits to this.
Tracy Stampfli: By breaking the code apart into smaller frameworks, and breaking up that big app target, you reduce interdependencies, you remove this tight coupling between features that can make things more fragile and introduce bugs. You enable …
Tracy Stampfli: You have to have separation of concerns, because you’re actually separating the code out. This enables developers to work better, independently from each other, and also improves the build times, because if you’re building your feature in a framework, as you make changes, you only have to rebuild that framework and not the entire app target.
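The build-time point above can be illustrated with a small sketch. This is not Slack's tooling: the module names and the dependency graph are hypothetical, and the rule used here (rebuild the changed module plus anything that depends on it) is a simplified assumption about how a modular build decides what to redo.

```python
from collections import deque

# Hypothetical module graph: each module maps to the modules it depends on.
DEPENDENCIES = {
    "App": ["Messaging", "Search"],
    "Messaging": ["Networking"],
    "Search": ["Networking"],
    "Networking": [],
}


def modules_to_rebuild(changed, dependencies):
    """Return the changed module plus every module that depends on it,
    directly or transitively."""
    # Invert the graph so we can walk from a module to its dependents.
    dependents = {name: set() for name in dependencies}
    for module, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


if __name__ == "__main__":
    for module in ("Search", "Networking"):
        print(module, "->", sorted(modules_to_rebuild(module, DEPENDENCIES)))
```

In this toy graph, a change in a leaf feature module such as `Search` touches far fewer modules than a change in a widely shared one such as `Networking`, which is the trade-off that finer-grained modules are meant to exploit.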
Tracy Stampfli: The third phase was modernization. This is the one where, again, we’re really trying to look forward and think about what architecture patterns do we want to adopt that will make us compatible with current industry trends, but also will set us up for the next five years of app development.
Tracy Stampfli: We got into this project. As we got started and were going through the stabilization phase, we realized that actually this whole three phases thing wasn’t such a great idea. We realized that we should actually combine our three phases back down into two, and combine the modularization and modernization phases, because we realized that, if we’re doing these separately, you would refactor some code to split it out from the app, then modularize it, and then you would refactor it again to modernize it. That was going to entail a lot of wasted effort. To avoid refactoring things twice, we combined these phases and just made it a two phase project. So now it’s just stabilization and modernization.
Tracy Stampfli: As we were deciding what we were going to do for our modernization, and trying to decide what patterns we should pick, we did a lot of research with our developers, talking to our developers, doing focus groups, doing polling, trying to figure out what are the current pain points, and what did developers want to see out of this project. What did they really want us to do?
Tracy Stampfli: What we heard from them was interesting, because we didn’t hear that they felt super strongly about one particular feature architecture, or exactly how we did dependency injection. What we heard from them was that they wanted us to increase what you see here. CPR is the acronym we came up with: consistency, predictability, reliability. That was what was important to them, improving these things.
Tracy Stampfli: Consistency. Making it so that all the features are built in the same way. So it’s easy to tell, if you look at another feature, that it is built the same way as the other ones, the ones that you have built. So you’re familiar with the code. It all works the same way.
Tracy Stampfli: Predictability. Making things like routing and deep linking understandable. Making sure that the code actually works the way you think it’s going to work. Removing unintended effects where making a change somewhere in the product for some reason breaks something somewhere else.
Tracy Stampfli: Reliability. Again, making the code less fragile, so we’re spending less time on regressions and incidents and things like that.
Tracy Stampfli: So this was really what we wanted to focus on. How do you do that? Well, we realized that if we wanted to increase consistency, we couldn’t just do that by coming up with some feature architecture pattern and expecting every developer in the team to implement it exactly the same way. You can’t just hope that it’ll work.
Tracy Stampfli: You have to enforce these things through things like templates, and linting, and code generation, and basically making it very easy for developers to know what the right thing is to do, and to do that, and much harder for them to do things that we don’t want them to do, and re-introduce inconsistencies and anti-patterns. If we’ve decided we don’t want developers to use singletons, we need to add a linting rule that actually prevents that, versus just hoping that they won’t.
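As an illustration of the singleton example above, here is a minimal sketch of what such a lint check might look like. It is an assumption for illustration only, not the rule Slack uses: the regular expression, the property names it looks for (`shared`, `sharedInstance`, `instance`), and the function name `find_singletons` are all hypothetical, and a real linter would work on parsed syntax rather than on raw lines.

```python
import re
from typing import List, Tuple

# Hypothetical rule: flag Swift declarations that look like a shared singleton,
# for example `static let shared = Foo()`. The pattern is illustrative only.
SINGLETON_PATTERN = re.compile(
    r"\bstatic\s+(?:let|var)\s+(?:shared|sharedInstance|instance)\b"
)


def find_singletons(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, line_text) for each line that matches the rule."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("//"):
            continue  # skip single-line comments
        if SINGLETON_PATTERN.search(stripped):
            violations.append((number, stripped))
    return violations


if __name__ == "__main__":
    sample = "\n".join([
        "final class SessionStore {",
        "    static let shared = SessionStore()",
        "    // static let shared = OldStore()",
        "    let token: String = \"\"",
        "}",
    ])
    for number, text in find_singletons(sample):
        print(f"line {number}: singleton declaration not allowed: {text}")
```

Run in CI, a check like this fails the build when the pattern reappears, which is the difference between enforcing a convention and hoping people follow it.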
Tracy Stampfli: What did we end up deciding on as our main goals for modernization? Again, not going to get very deeply into this, but on iOS, this big push to break down the app into, again, service and feature modules. So increased modularization. We did decide to adopt a new feature architecture that’s VIPER-like. We’re adopting Combine.
Tracy Stampfli: This last one is actually pretty important. We’re switching to using Bazel. I don’t know if folks are familiar with it, but Bazel is a build system that you can use in place of Xcode or alongside Xcode. It deals better with highly modularized projects, projects where there are many, many frameworks. Bazel handles that pretty well. It also has a build cache, which means, hopefully, again, we’re going to see improved build times, both locally and in CI, by having better caching of build artifacts.
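To make the build-cache idea concrete, here is a toy sketch of caching build outputs by a hash of their inputs. It shows the general technique only and does not reproduce Bazel's implementation or API; `cache_key`, `BuildCache`, and the file names are hypothetical, and the "artifact" is just a placeholder string.

```python
import hashlib


def cache_key(sources):
    """Hash file names and contents in a stable order to identify a set of inputs."""
    digest = hashlib.sha256()
    for name in sorted(sources):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sources[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


class BuildCache:
    """Toy cache: inputs with the same key reuse the stored artifact."""

    def __init__(self):
        self._artifacts = {}
        self.builds = 0  # number of times we actually had to build

    def build(self, sources):
        key = cache_key(sources)
        if key not in self._artifacts:
            self.builds += 1  # stand-in for running the compiler
            self._artifacts[key] = "artifact-" + key[:8]
        return self._artifacts[key]


if __name__ == "__main__":
    cache = BuildCache()
    cache.build({"Feature.swift": "struct Feature {}"})
    cache.build({"Feature.swift": "struct Feature {}"})  # unchanged inputs: cache hit
    cache.build({"Feature.swift": "struct Feature { let id = 1 }"})  # changed: rebuild
    print("actual builds:", cache.builds)
```

Because the key depends only on the inputs, the same cache can in principle be shared between a developer machine and CI, which is where the hoped-for time savings come from.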
Tracy Stampfli: On Android, again, similar themes. Completing modularization on that side as well. Adopting some new frameworks and libraries to do things like networking, and JSON parsing, and configuration changes. And building abstractions around startActivity and onActivityResult.
Tracy Stampfli: Why has this project been successful, or has it been successful? Well, we started off with the stabilization phase. All goals of that phase were completed on time for both platforms, 100% completed everything. So that was certainly successful.
Tracy Stampfli: We are now in the middle of the second phase, the modernization phase. That’s still in progress, so there’s a bit of a caveat here. We’re on track, but there’s still a ways to go. But we’re successful so far.
Tracy Stampfli: Why has this been successful? Number one, it was an engineering-led initiative with executive prioritization and resourcing. I think both of these things were really important and really key to the success.
Tracy Stampfli: This has really been an engineering-driven, IC-driven project. It was engineers who came up with the initial proposals, who did the research, who figured out what the problems were and how we should solve them, and who came up with all of the scoping and projects and proposals that have led to what we were actually doing as part of the project. So very much not a top-down project.
Tracy Stampfli: It was very much driven by engineers. But it had executive prioritization and resourcing, which is equally key. This isn’t something that you can do as a side project. It really needs dedicated resources. It needs executives to sign onto the fact that we’re going to do this in place of doing something else, in place of doing feature work or whatever. So we have to have that buy in that this is the right thing to do, and we’re going to get the resources to do it properly.
Tracy Stampfli: Along with that, sponsorship from key engineering leaders outside of mobile helped us get that buy in. We had principal engineers and key managers who were willing to say, “Yes. This is super important. This is the right thing to do. We need to do it now.” That helped us get this backing that we got to actually do the project correctly.
Tracy Stampfli: Additionally, splitting the project into a couple of phases was really helpful, because having this initial stabilization phase, we just knew what we had to do. We were getting rid of tech debt. Things were really well-defined. We just had to execute on it. That gave us time to do R&D for the later phase.
Tracy Stampfli: So during the stabilization phase, we were able to do the investigation, and the prototyping, and figure out what should we do for modernization, and make sure that we had all of those things set up, and all of our scoping and all of that by the time we hit the modernization phase.
Tracy Stampfli: Finally … This one is very, very important. This has been a very metrics- and data-driven effort. We really wanted to have very clear metrics to measure our progress, what we’re doing, how successful we’re being. It wasn’t just tracking with Jira or whatever. We were actually running scripts on every file in the codebase to track, how are we doing with these migrations, what progress have we made, where are we. We had a lot of dashboards and graphs.
Tracy Stampfli: This is an example of one of … For each of the goals that we had, we would have a dashboard saying where are we with this in terms of progress on this measure. That allowed us to do things like see if we were falling behind or we’re doing fine. We could switch resources between different work streams if we needed to, if one of them wasn’t doing as well.
Tracy Stampfli: On this particular graph, you could see it doesn’t quite go to zero at the end. That was because we had ported the code to Swift, but some of it was still behind feature flags, as we were rolling out those feature flags when this snapshot was taken.
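A script of the kind described above, one that walks every file and reports migration progress, might look roughly like the following. This is a sketch under assumptions, not the script Slack ran: the extension-to-language mapping, counting by lines rather than by files, and the function names are all choices made for illustration.

```python
import os
import tempfile
from collections import Counter

# Assumed mapping from file extension to language; not taken from the talk.
LANGUAGE_BY_EXTENSION = {".swift": "swift", ".m": "objc", ".mm": "objc", ".h": "objc"}


def count_lines_by_language(root):
    """Walk `root` and count source lines per language."""
    totals = Counter()
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            language = LANGUAGE_BY_EXTENSION.get(os.path.splitext(name)[1])
            if language is None:
                continue  # not a file type we track
            path = os.path.join(directory, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                totals[language] += sum(1 for _ in handle)
    return totals


def swift_percentage(totals):
    """Share of counted lines that are Swift, as a percentage."""
    total = totals["swift"] + totals["objc"]
    return 100.0 * totals["swift"] / total if total else 0.0


if __name__ == "__main__":
    # Build a tiny throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "Feature.swift"), "w", encoding="utf-8") as f:
            f.write("line\n" * 8)
        with open(os.path.join(root, "Legacy.m"), "w", encoding="utf-8") as f:
            f.write("line\n" * 2)
        totals = count_lines_by_language(root)
        print(f"Swift share: {swift_percentage(totals):.1f}%")
```

Running something like this on a schedule and plotting the result over time gives the kind of per-goal progress graph the talk refers to.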
Tracy Stampfli: But basically, this was very, very helpful for the initial phase of the project. We are continuing that in the modernization phase. We’re really trying to come up with very clear metrics for each of our goals, so that we can track it and see that we are being successful. That enables us to do resourcing planning and all sorts of things that are very, very helpful for the success of the project.
Tracy Stampfli: That is the end of my presentation. If you have … I don’t think we’re going to have time for questions, but feel free to reach out to me. I’ll try to respond in the chat if folks have questions. Feel free to connect with me on LinkedIn if you want.
Tracy Stampfli: Also, as other folks have mentioned, Slack is also hiring. We’re hiring on the iOS team. We’re hiring on Android. We’re hiring on other teams as well. If you’re interested, please check out the careers page at Slack.