Vladimir Kiselev ⋮ Writing

On how to measure developer productivity


Photo by David Yu from Pexels

The other day I was listening to a Microsoft Research podcast (New Future of Work: How developer collaboration and productivity are changing in a hybrid work model, https://www.microsoft.com/en-us/research/podcast/new-future-of-work-how-developer-collaboration-and-productivity-are-changing-in-a-hybrid-work-model/).

A note: the podcast is very interesting in its own right; my commentary is 100% optional and is based only on my own observations.

One of the topics mentioned there is developer productivity, so I want to focus on that and share some thoughts.

The model

It’s easier to evaluate a metric if you think of it as a property of an imperfect model: a value along one dimension that represents one of the model’s qualities.

Let’s explore an example of a model and one of its properties.

An airplane gets off the ground. You need to explain why.

  • A physicist will probably build a model that explains the takeoff in terms of Newton’s laws, Bernoulli’s principle, the airfoil (the shape of the airplane’s wing that helps to lift the plane), etc.
  • An economist will probably just say that it happens because it makes sense to move the plane from A to B, maybe using money alone to measure it: the price of the output (or potential output), given the risks involved, is greater than the cost of the input. Literally, the folks bought tickets, so the plane will fly (I often use this as a joke to explain the difference between the models).
  • A space-era engineer might say that even grandma’s old drawer can fly well as long as we have the right engine and it can change the vector of the applied force.
  • A chemistry-based explanation will probably focus on the fuel, and on physics as well.

All these explanations are good, I think. They just use different models, and each model can help the people using it to learn about something worth fixing. But it is dangerous to fix something based on a one-sided view only, especially if the model itself has just a few measurable properties. The simpler the model and the smaller the set of properties it measures, the harder it is to be sure that changing the thing being modelled will not cause a disaster. Good examples are trying to fix a physics issue with code while ignoring the physics-based solutions, or deciding how to set up the developers’ computers based on a financial model only. There should be some feedback: a lower-level model, or at least human feedback.

Also, when a model lacks a big enough set of properties to measure, there is a risk of building a structure that tries to provide good measurements instead of actual output. Take lines of code, issues fixed, or pull requests completed: if you focus on them only, the community being measured will probably adapt by creating more lines, more bugs to fix, and unnecessary changes, just for the sake of the effectiveness cup.

The context

Simple math says 2 * 5 = 5 * 2, but for a human the two can be very different depending on the context. Think of carrying groceries, when you can choose between 5 trips to your car with 2 bags each or 2 trips with 5 bags each.

Even if you model a single car driver, and even if that driver is very talented, the context can still easily reduce the quality of their driving (example: a good driver can struggle in a place where the other drivers are not that good).
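To make the grocery example concrete, here is a minimal sketch. The effort formula and the values of walk_cost and strain_per_bag are my own assumptions, picked only for illustration and not measured from anything. It shows that the same 10 bags can cost a different amount of effort depending on how the trips are split.

```python
def carrying_effort(trips, bags_per_trip, walk_cost=1.0, strain_per_bag=0.2):
    """Toy effort model (hypothetical): every trip has a fixed walking cost,
    plus a strain that grows with the square of the load carried."""
    per_trip = walk_cost + strain_per_bag * bags_per_trip ** 2
    return trips * per_trip


# Same number of bags in both plans: 5 * 2 == 2 * 5
many_light_trips = carrying_effort(trips=5, bags_per_trip=2)
few_heavy_trips = carrying_effort(trips=2, bags_per_trip=5)

print(many_light_trips, few_heavy_trips)
```

With these made-up parameters the two plans come out different, even though the plain bag count is identical. Change the parameters (a longer walk, a stronger person) and the preferred plan can flip, which is exactly the point about context.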

I mean, is it a good idea to measure out of context? I don’t think so. The pull-request count is not a good metric when it is measured alone and expressed as a single number. A graph of pull requests, showing which one made another possible, might be better (one change might allow many more changes to be made). Yet still, is it fair to say that it’s only up to the developer to decide on making big, impactful changes? With regard to developer productivity, the team is the context and the pull requests are the team’s result, so it’s hard to say that a developer’s work can be easily expressed as a number.
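A rough sketch of the "graph of pull requests" idea, under my own assumptions: the PR ids and the enables mapping (which PR made which other PR possible) are hypothetical, and in practice someone would have to record those links by hand or infer them. The function counts how many later changes each pull request unlocked, directly or indirectly.

```python
def downstream_counts(enables):
    """For each PR, count the PRs it enabled directly or transitively.

    `enables` maps a PR id to the list of PR ids it directly made possible
    (a hypothetical, hand-maintained mapping).
    """
    def reachable(start):
        seen = set()
        stack = list(enables.get(start, []))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(enables.get(node, []))
        return seen

    # Include PRs that appear only as somebody's follow-up.
    nodes = set(enables) | {pr for followers in enables.values() for pr in followers}
    return {pr: len(reachable(pr)) for pr in nodes}


# Made-up example: PR-1 is a refactoring that unlocked three other changes.
enables = {
    "PR-1": ["PR-2", "PR-3"],
    "PR-2": ["PR-4"],
    "PR-3": ["PR-4"],
    "PR-5": [],
}
counts = downstream_counts(enables)
for pr in sorted(counts):
    print(pr, counts[pr])
```

A flat count treats all five pull requests the same, while here PR-1 stands out with three downstream changes. Even so, this says more about how the team's work fits together than about any single developer.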

The input and the outcome

The outcome (the product, the software, the properties etc.) should not be the only thing to measure, I think. It seems to be necessary to think of the input and measure it as well.

Example: there were times in the past when I had 5+ conversations happening in Teams at the same time. At some point I was thinking of proposing an app feature that would make it easier for devs to manage that queue, including letting the manager decide what has the highest priority. I think it’s very hard for a human to focus on too many things at the same time because of the limits of our short-term working memory (chimpanzees are better at it, by the way).


The remote era made it harder for people to see how many conversations you are having right now, and I think this could be fixed by simply letting others see how big your decision queue is. It’s a good measurement, though maybe not of a developer’s skills. It may be more useful for studying the communication graph in order to simplify the processes in general (e.g. directed acyclic comms can be faster: no back and forth).
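As a small illustration of the "directed acyclic comms" remark, the sketch below checks whether a communication graph contains a loop, meaning a request can travel back to where it started. The sends_to mapping and the role names are hypothetical; I am assuming that who-asks-whom has already been collected somehow.

```python
def has_back_and_forth(sends_to):
    """Return True if the directed communication graph contains a cycle.

    `sends_to` maps a person or role to the list of people they hand
    requests to (hypothetical data). Uses a depth-first search with
    three node states.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in sends_to.get(node, []):
            nxt_state = state.get(nxt, UNVISITED)
            if nxt_state == IN_PROGRESS:
                return True  # reached a node already on the current path
            if nxt_state == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(
        state.get(node, UNVISITED) == UNVISITED and visit(node)
        for node in list(sends_to)
    )


# Made-up examples: a one-way pipeline versus a loop between two roles.
pipeline = {"manager": ["dev"], "dev": ["qa"], "qa": []}
ping_pong = {"manager": ["dev"], "dev": ["manager"]}

print(has_back_and_forth(pipeline))
print(has_back_and_forth(ping_pong))
```

The first graph is acyclic and the second one is not. Some loops are healthy (a review, a clarifying question), so I would treat a detected cycle as a prompt to look at the process, not as a verdict.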

Overall

I think it makes more sense to model and measure the teams instead: what they deliver (the output), given an input and a context. This alone can potentially give organizations enough feedback to improve. A developer’s productivity can be measured by their team.

If somebody needs to derive a number that correlates with the productivity of a single developer, they can end up with a number that means nothing, something like an abstract computer performance index. OK, the index is 10, but the real question is whether I can run the new enhanced Skyrim or not.

So, for me the answer is: just talk to the team members if you want to know more about their productivity and help them to be productive, but don’t use this data out of the context of their team. A good old 1–1 can help a lot, perhaps more than output-based automation.

Originally posted on Medium: https://medium.com/@nettsundere/on-how-to-measure-the-developers-productivity-b7df7d572d2a