Today on the show we will be talking about the Batman and support rotations. The Batman or Batwoman is a real role and not just a superhero. In the context of software engineering and our day-to-day lives, the Batman is a support role for a particular sprint. For example, suppose a sprint has several stories planned, but a lot of bugs have surfaced from the previous iteration. The Batman can be assigned specifically to tackle those bugs and any support tickets. Their main role is to do the unplanned work, such as bugs carried over from the previous iteration or bugs introduced during the current one. Stay tuned as we go deeper into the importance of the Batman role and strategies to prevent burnout and transfer knowledge across rotating support roles.
Key Points From This Episode:
- The role of a batman in the context of agile software engineering.
- How the ‘Batman’ or ‘Batwoman’ supports the team by dealing with bugs in production.
- Different names and titles for the support role of ‘Batman’.
- Why sometimes the role doesn’t have a particular title.
- The importance of formalizing the role of ‘Batman’ to prevent burnout.
- Using pairing strategies to transfer knowledge from the Batman to rotating support.
- The importance of overlapping support during the transition period of the Batman role.
- Implementing a maintenance window if you are on PagerDuty.
- And much more!
Transcript for Episode 40. The Batman Support Rotations
[0:00:01.9] MN: Hello and welcome to The Rabbit Hole, the definitive developer’s podcast in fantabulous Chelsea, Manhattan. I’m your host, Michael Nunez. My co-host today.
[0:00:09.8] DA: Dave Anderson.
[0:00:10.8] MN: And our producer.
[0:00:12.0] WJ: William Jeffries.
[0:00:13.3] MN: Today we’ll be talking about the Batman and support rotations. Now, I’m surprised I was actually able to say the Batman without the Batman voice. Batman. Dave, you do it.
[0:00:25.6] DA: It’s me.
[0:00:28.3] MN: Who are you?
[0:00:30.0] DA: Not the role you need, it’s the role you deserve.
[0:00:33.6] MN: There you go. I can’t. I have a horrible batman voice, I guess. But today we’ll be talking about the batman, the batman is a real role.
[0:00:42.2] DA: Not just a super hero.
[0:00:43.8] MN: Not just a superhero and –
[0:00:45.8] DA: Well, maybe a super in some ways.
[0:00:49.0] MN: Well not, yeah. I mean, I can go into an argument, a tangent from this podcast on why Batman is not a super hero, but that’s beside the point. We’re going to get a lot of haters.
[0:00:58.5] WJ: He’s not a super hero because it’s a military position.
[0:01:01.9] MN: Oh, right. Is that so?
[0:01:03.8] WJ: Yeah. Batman was the person who took care of a military officer’s bat horse, which is a pack horse. Bat is actually French for pack and so the batman was a servant, it was a lowly role, but a noble one. Doing good work.
[0:01:22.6] DA: But super important. Helping out the officer who is a very important guy and he’s got, his time’s very important. You know, he doesn’t want to be fussing with his packs, he’s got the batman. It’s like a little bit less dramatic than Christopher Nolan’s.
[0:01:38.3] MN: The Batman. “I’m the Batman, let me go get your horse.” That’s not exactly the role they wanted to do it. But…
[0:01:46.7] DA: But like, in the context of software engineering and our day to day lives like, I think you brought this up first with James Shore’s Art of Agile, right? He has a whole section about the batman in the context of an agile team like what that role is.
[0:02:04.0] MN: Yeah, the batman in this particular example, in agile software development, is a person whose particular role is that of a support role in a particular sprint. Suppose there’s a lot of stories, but a lot of bugs have surfaced from the previous iteration. The batman can be assigned to specifically just tackle those bugs and any support ticket.
[0:02:32.3] DA: Yeah, it can be kind of distracting when you’re trying to work on some really critical feature work when a bug comes up in production or a high urgency, you know, the suit comes down from upstairs.
[0:02:44.8] MN: Yeah, you’ve got to take care of that work when the suit comes down. It’s just no good.
[0:02:50.4] DA: Their main role is just to do the unplanned work I guess, right?
[0:02:54.9] MN: Yeah, it’s pretty much the unplanned work or the unplanned, planned work, right? Because it could be the bugs that were planned from the previous iteration but they also could be bugs that were introduced during that iteration as well.
[0:03:07.0] WJ: I’ve heard other names for this role as well.
[0:03:08.8] MN: What do you have?
[0:03:09.7] WJ: On a previous team, we called it the Bug Master and we actually had a staff, the bug master staff.
[0:03:15.4] MN: The Bug Master.
[0:03:16.8] WJ: Yeah. It was the same kind of deal where like, if something came up, whether it was a planned bug or an unplanned bug, the person with the bug master staff, everybody would point to them. “Oh hey, you have the bug master staff this sprint, can you go deal with this problem?”
[0:03:31.9] MN: Right.
[0:03:32.9] DA: Was that like a dubious honor or people were like, “Yeah, I’m The Bug Master”?
[0:03:36.5] WJ: Well, originally nobody wanted to be the Bug Master and so we got a staff for it and that helped.
[0:03:42.9] MN: Yeah, items do help. A staff is amazing.
[0:03:46.9] WJ: It wasn’t like distinctive enough, we bought a finger puppet, we bought a couple of finger puppet bugs and it attached — I got a spider, it was like a really creepy looking spider that I put on the top of it. Nobody liked that. So we replaced it with a grasshopper finger puppet. It sat on the top of the staff and that was much more popular and so instead, I would use the spider to just troll people. I would throw the spider onto their keyboard while they weren’t looking and they would like look back and be like, “Oh god, there’s a spider on my keyboard.”
[0:04:18.6] MN: Did you just smash the spider with the staff?
[0:04:21.9] DA: I feel like that would be pretty appropriate, yeah.
[0:04:23.9] MN: Does it break people’s keyboards that way I guess too.
[0:04:26.7] WJ: People, they hated me.
[0:04:28.2] DA: Yeah.
[0:04:28.8] WJ: I was not popular that week.
[0:04:30.9] DA: It’s rough. Yeah, I mean, I think I’ve had people, projects that I’ve worked on fill similar roles but sometimes it doesn’t have a title at all. Sometimes it’s more of like an on call support rotation kind of a deal.
[0:04:44.9] WJ: Yeah, if it isn’t formalized, you definitely need to make sure that it does get rotated because people will burn out really fast. There’s a tendency to put your most senior people on this, and that is a terrible idea.
[0:04:57.0] DA: Right, yeah. Because they have the most context about all of the system, they are able to very quickly go in and, you know, make the change that needs to get made and save the day.
[0:05:04.8] WJ: They’ll fix it faster than anybody else but then no one else will ever learn to fix it.
[0:05:08.3] MN: Right.
[0:05:08.7] DA: Yup, they’re your Brad.
[0:05:10.7] WJ: Or Brent.
[0:05:10.8] DA: Or Brent, not Brad. From the hit book, what is that called?
[0:05:17.8] WJ: The Phoenix Project.
[0:05:19.1] DA: The Phoenix Project. Yeah, your Brent is the one who is simultaneously the solution to and the cause of all the problems.
[0:05:29.2] MN: Oh, wow.
[0:05:30.5] DA: Because they just go in and they’re able to go into production and just fix all the issues and have all the context and no one else can replace them because they don’t have the knowledge that they need to get.
[0:05:42.9] WJ: Amazing job security.
[0:05:44.6] MN: Oh yeah. That is an underlying problem.
[0:05:48.3] DA: That’s like definitely a huge benefit of rotation too, because people will be in that situation and they will need to get that context or ask your go-to guy and then actually talk to him.
[0:06:00.6] WJ: Yeah, if you do have a Brent, Brent is usually a super nice guy who means well and if you have him pair with whoever is the batman that sprint, then you can transfer some of that knowledge out of his head and make him less of a bus factor problem.
[0:06:13.8] MN: Right. That is important, you want to increase the number of people who know how to fix that problem. Just to clarify, the Batman in this particular case – this doesn’t necessarily mean that you’re on call 24 hours when you are the Batman. I know there may be places that would consider that support role to do that, but that’s not always the case in the example that we’re providing, right?
[0:06:45.5] DA: Right, yeah.
[0:06:46.3] MN: It’s, your work day consists of trying to figure out these complex – the complexity of the bugs in the system or handle those bugs throughout the day. But that person may also do 24 hours, I’m not 100% sure? But that doesn’t, just because you’re the Batman doesn’t mean you will do 24 hours.
[0:07:04.4] DA: Right, yeah it’s very team specific.
[0:07:06.6] WJ: Or the Batwoman.
[0:07:07.8] MN: Or the Batwoman. Yeah, I think we should definitely say that.
[0:07:11.5] DA: Right, this is no longer the 1700’s you know?
[0:07:14.6] MN: Military terms in the 1800’s. Batwoman. I have to figure out the Batwoman voice.
[0:07:21.0] DA: Yeah, I think there’s like some challenges with this too, like defining the scope of the role, especially like with the support rotations, you know, and sometimes people want to have like collective code ownership over like, all of the code bases in the organization but if you don’t have enough context to support that, it can be like really anxiety-inducing to have to be responsible for any kind of problems that are outside of your domain of expertise.
Yeah, making sure that there’s like, some context for what kind of things were happening last week. So I’ve seen overlapping support rotations or overlapping time for having a backup to help out during the transition, that really helps as well.
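The rotation with an overlapping backup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the team names, the `Shift` record, and the `build_rotation` helper are all hypothetical, and the rule that last sprint's primary stays on as backup is one possible reading of "overlapping support", not something the hosts prescribe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shift:
    sprint: int              # 1-based sprint number
    primary: str             # this sprint's Batman / Batwoman
    backup: Optional[str]    # last sprint's primary, kept on for the handover


def build_rotation(team: list[str], sprints: int) -> list[Shift]:
    """Assign one primary support person per sprint, round-robin.

    The previous sprint's primary stays on as backup, giving an overlap
    in which context about last sprint's issues can be handed over.
    """
    if not team:
        raise ValueError("team must not be empty")
    schedule: list[Shift] = []
    for i in range(sprints):
        primary = team[i % len(team)]
        # No backup in the first sprint: nobody held the role before.
        backup = team[(i - 1) % len(team)] if i > 0 else None
        schedule.append(Shift(sprint=i + 1, primary=primary, backup=backup))
    return schedule


if __name__ == "__main__":
    # Hypothetical three-person team over four sprints.
    for shift in build_rotation(["Ana", "Ben", "Chris"], 4):
        print(shift)
```

Round-robin keeps the role from settling on the most senior person, and the backup slot is where pairing with the previous holder would happen.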
[0:08:06.0] MN: The thing I’ve seen before is, we didn’t call it support or the Batman or anything of that nature. We just called it the buddy system and it was you and a buddy would team up and be the support for the team. The buddy system had two individuals from two separate teams that like, similar product like some of the facing product.
An example would be like the address service and the account service. Like one individual from each one of those teams would pair up and handle bugs that were introduced on both sides and speak to the product owner to figure out which one was more important. But the buddy would then pair program with the other individual to know how to fix those bugs, and that way the knowledge transfer happened as well.
Across teams as well as within the team. So it’s like cross team and within the team at the same time. But with the buddy system, like I think William had mentioned before, you want to pair with someone, especially if the batman or woman isn’t the one with all the contextual knowledge, so that you can knowledge transfer that way as well. So pairing is really important, like useful, in this particular support role case.
[0:09:24.1] DA: Yeah, it’s an argument for the bat people.
[0:09:26.5] MN: The bat peoples, I’m just really bad at the Batman voice. I don’t know who’s brought it up before, but one of the reasons why the batman or woman is important to the team is because, while this one person’s role is to handle bugs and support, it helps increase velocity because not everyone on the team has to then context switch to fix bugs. You just have this one person or a pair of people handling those bugs.
[0:10:00.0] DA: Yeah, it totally makes sense like if there are unplanned things coming up then you can’t meet your commitment for the sprint and it’s demoralizing, you know? It’s good to have someone there to take that hit.
[0:10:14.6] MN: Right, so you are doing a good service batman or woman so keep it up. I think it may be overall that it doesn’t get a lot of praise, but it is pretty awesome when you’re the individual who’s swooping in and fixing all of those bugs and making sure that the team is still rolling well.
[0:10:33.5] WJ: So when you are the bat person are you working nights?
[0:10:37.4] MN: I think it depends. I think PagerDuty is a thing that I have heard of before.
[0:10:42.6] DA: Yeah, it’s a pretty good tool to help manage those kinds of rotations.
[0:10:46.3] WJ: Yeah, I think that the two roles are not necessarily the same. You could be on pager duty and not be the batman or woman, but it does often make sense to just have that be the same person. Especially if you have already set up a system whereby people can easily contact the bat person it’s just going to –
[0:11:07.7] DA: You already have your bat signal ready to go?
[0:11:10.1] MN: Exactly. It’s just 911 text to batman, please help. Help.
[0:11:16.1] WJ: Yeah, PagerDuty is a great tool. I mean it sucks to be on it and get called in the middle of the night, but the tool itself is very flexible.
[0:11:26.0] DA: Yeah.
[0:11:26.4] WJ: Have you guys worked with it before?
[0:11:28.1] MN: I haven’t had the opportunity to work with PagerDuty. I mean, when I was on call I wasn’t on PagerDuty. It was just like, “Hey keep your phone on because you may get calls.”
[0:11:39.7] DA: Yeah just like a regular call from a person.
[0:11:42.2] MN: Yeah, it wasn’t an app per se. But I mean it is a good way to mitigate any situations that happen throughout the night. I’ve just been lucky enough where I haven’t been assigned as that person.
[0:11:54.4] DA: Yeah, I’ve had the pleasure of working with PagerDuty, but I only say pleasure because I managed to get through my rotations without getting paged a single time.
[0:12:02.1] MN: Oh nice.
[0:12:04.4] DA: But, you know, the very specific like on call responsibility which may or may not be part of this Batman role. That can be pretty hit or miss. Like I definitely know people who are getting consistently woken up at 4 o’clock in the morning for not very good reasons.
[0:12:21.1] WJ: You know, you can set maintenance windows on PagerDuty. It is a super handy feature. If you’re doing some kind of job or something that you kind of expect is going to fail in the middle of the night, and really it makes more sense for someone to just deal with this in the morning, you can set a maintenance window and then it won’t go off until the end of the maintenance window.
[0:12:39.6] DA: Yeah that sounds good. I’ll just set those all the time.
[0:12:41.9] MN: Yeah, sleep around me, you know? Exactly, does maintenance all the things all the times.
[0:12:47.8] DA: Yeah so that’s always a work in progress.
[0:12:50.0] WJ: We used it for a client for load testing.
[0:12:53.7] DA: Cool.
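The maintenance-window idea William describes can be modeled roughly as follows. This is a generic sketch of the behavior as stated in the conversation (an alert raised inside a window is held until the window ends); it does not use or represent the PagerDuty API, and the `delivery_time` function and the example times are hypothetical.

```python
from datetime import datetime

# A maintenance window is a half-open interval [start, end).
Window = tuple[datetime, datetime]


def delivery_time(alert_at: datetime, windows: list[Window]) -> datetime:
    """Return when a page for this alert should actually go out.

    If the alert falls inside a maintenance window, it is held until the
    end of that window; otherwise it is delivered immediately.
    """
    for start, end in windows:
        if start <= alert_at < end:
            return end
    return alert_at


if __name__ == "__main__":
    # Hypothetical overnight window for a job that is expected to fail.
    overnight = (datetime(2018, 1, 10, 1, 0), datetime(2018, 1, 10, 7, 0))
    print(delivery_time(datetime(2018, 1, 10, 4, 0), [overnight]))  # held
    print(delivery_time(datetime(2018, 1, 10, 9, 0), [overnight]))  # immediate
```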
[0:12:54.9] MN: Yeah, I think just like the idea of the Batman who’s now responsible for making sure that the velocity stays at peak while continuously smashing bugs and supporting the rest of the team is an important role to have on a team and an important person to be, so.
[0:13:16.3] DA: Yeah, it’s a good team player in helping everybody do the good work.
[0:13:20.0] MN: Exactly. So, as Dave mentioned earlier in the episode, it’s not the role you want, it’s the role you deserve, or whatever the case may be, because the team deserves the hero to come in to ensure that things go as well as possible. So continue being batmen and women out there. Cool, do we have any teach and learns we want to discuss today?
With GraphQL, when you ask for something you only get what you want and what you asked for. It’s more like writing a SQL query in that way, but it is strongly typed and it is very structured actually. So through introspection, it’s usually possible to generate a schema file from your server and a queries file from the client, and through the simple act of linting these two files against each other you can have a really nice simple contract test. So I was reading about that recently and that’s pretty neat.
[0:15:07.1] MN: I imagine that’s a much more elegant way than, like, ActiveRecord, because you’re just dealing with a JSON file that you are requesting and you get only what you want, not like these crazy SQL-related queries that you would have to run.
[0:15:25.0] DA: Yeah and the cool thing about having the schema and these queries that you can compare against the schema is that if the server changes in a way that would break the client or if the client changes in a way that would not work with the server, then you have a pretty early warning about that before you push to prod.
[0:15:42.9] MN: Interesting. Cool and that was Apollo you said?
[0:15:45.6] DA: Yeah, Apollo and GraphQL. Apollo has tools that make those kinds of things happen easier, but there are a whole bunch of clients out there and GraphQL is a state of mind. That’s like a philosophy or a specification.
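The schema-versus-queries check Dave describes can be illustrated with a toy model. This is a deliberately simplified stand-in, not real GraphQL tooling: the schema and the client's selections are plain nested dictionaries, and `SCHEMA`, `check_selection`, and the field names are all hypothetical. A real project would more likely rely on a GraphQL library or linter to validate actual query documents against an introspected schema.

```python
# Toy schema: type name -> {field name: name of the field's type}.
SCHEMA: dict[str, dict[str, str]] = {
    "Query": {"user": "User"},
    "User": {"id": "ID", "name": "String", "address": "Address"},
    "Address": {"city": "String"},
}


def check_selection(
    schema: dict[str, dict[str, str]],
    type_name: str,
    selection: dict,
    path: str = "",
) -> list[str]:
    """Return a list of errors for fields the client asks for that the
    schema does not provide. An empty list means the contract holds."""
    errors: list[str] = []
    fields = schema.get(type_name, {})  # scalars have no sub-fields
    for field, sub_selection in selection.items():
        where = f"{path}.{field}" if path else field
        if field not in fields:
            errors.append(f"{where}: no such field on {type_name}")
        elif sub_selection:
            # Recurse into the type this field returns.
            errors.extend(
                check_selection(schema, fields[field], sub_selection, where)
            )
    return errors


if __name__ == "__main__":
    good_query = {"user": {"id": {}, "address": {"city": {}}}}
    bad_query = {"user": {"email": {}}}  # the server never defined `email`
    print(check_selection(SCHEMA, "Query", good_query))
    print(check_selection(SCHEMA, "Query", bad_query))
```

Run in CI on both the server and client sides, a check of this shape gives the early warning mentioned above: removing a field from the schema or requesting one that does not exist fails before anything reaches production.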
[0:16:00.1] MN: Awesome, sounds good. Cool, wrapping up the show, I’d like to thank my co-host, Dave. Always good having you down.
[0:16:09.2] DA: It’s a pleasure.
[0:16:10.0] MN: And our producer, William, thanks for stopping on by.
[0:16:13.1] WJ: Great to be here.
[0:16:13.9] MN: Feel free to hit us up at Twitter.com/radiofreerabbit. This is The Rabbit Hole, we’ll see you next time.
Links and Resources: