xgerman

xgerman's technology blog

Technology Evaluation with a Balanced Scorecard

Introduction

Occasionally in the life of an engineer, management asks: should we use technology A, B, or C? They might even ask you to prototype with each and report back.

Decision Criteria

We could of course just report back that we like technology A because it's written in Golang (you would be surprised how often this is the sole criterion these days). But let's assume we (or management) want to do this with some more process. That's why, before starting the task, we should set up the decision criteria we are going to measure each solution against. Let's look at a few examples:

  • Programming Language - is it an approved language (Y or N)?
  • Community support - how many contributors? How active are they (1-10, with 10 very active and 1 not so much)?
  • Maturity - again 1-10, with 1 not mature at all and 10 very mature
  • How many years will the version be supported?
  • How long did it take to develop a prototype?
  • How difficult was the prototype to develop (1-10)?
  • How cute is the project's mascot?

This should give you an idea of what to look for, or maybe your organization already has a template for such exercises. In any case, before you start you should agree with management on the criteria you are evaluating. Then document that.

Weights

Once you have collected the values there is one important ingredient missing: weights.

Certain criteria are more important for your organization than others, so each criterion needs a weight. Let's look at an example:

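| Criteria              | Score A | Score B | Weight |
|-----------------------|---------|---------|--------|
| Programming Language  | Y       | N       | 20     |
| Community Support     | 3       | 6       | 20     |
| Maturity              | 2       | 4       | 10     |
| Support               | 4       | 1       | 5      |
| Prototype - How long  | 8       | 5       | 5      |
| Prototype - Difficult | 2       | 7       | 10     |
| How cute is mascot?   | 10      | 0       | 20     |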

Looking at the weights, the three most important criteria are the Programming Language, Community Support, and Cuteness of the mascot, whereas this organization cares little about support. This will certainly be different for your org.

Usually you want to express each weight as a percentage, with all the weights adding up to 100.
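The example weights above add up to 90 rather than 100. Here is a minimal sketch of how they could be rescaled to percentages while keeping their ratios; `normalize_weights` is a name made up for this illustration, not part of any template.

```python
# Weights copied from the example table (they sum to 90, not 100).
weights = {
    "Programming Language": 20,
    "Community Support": 20,
    "Maturity": 10,
    "Support": 5,
    "Prototype - How long": 5,
    "Prototype - Difficult": 10,
    "How cute is mascot?": 20,
}


def normalize_weights(raw):
    """Rescale weights so they add up to 100 while keeping their ratios."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: 100 * w / total for name, w in raw.items()}


normalized = normalize_weights(weights)
print(sum(weights.values()))            # raw total of the example weights
print(round(sum(normalized.values())))  # total after rescaling
```

Rescaling does not change which solution wins, since every weight is multiplied by the same factor; it only makes the numbers readable as percentages.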

For some values, like “How long it took to develop the prototype”, a higher raw value is not better, so we invert it: e.g. 8 means that of the allotted 10 days it took only 2. This keeps with our theme that the higher the score, the better. The same is true for Difficulty: the lower the number, the more difficult.
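The inversion described above could be sketched like this. Both function names are invented for the example, and clamping an overrun to 0 is an assumption rather than something the scorecard requires.

```python
def invert_duration(days_taken, days_allotted=10):
    """Turn "days needed" into a higher-is-better score.

    Finishing in 2 of 10 allotted days gives 8. Overruns are clamped to 0
    (an assumption) so the score never goes negative.
    """
    return max(0, days_allotted - days_taken)


def invert_scale(value, low=1, high=10):
    """Flip a rating on a low..high scale, so 10 becomes 1 and 1 becomes 10."""
    if not low <= value <= high:
        raise ValueError(f"value must be between {low} and {high}")
    return high + low - value


print(invert_duration(2))  # prototype took 2 of the 10 allotted days
print(invert_scale(9))     # a raw "very difficult" rating of 9, flipped
```

The second helper is only needed if reviewers rate difficulty the intuitive way around (10 = hardest) and the value has to be flipped before it goes into the scorecard.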

I usually represent boolean values (e.g. Programming Language Y/N) with 0 or 1, but that might skew the weights: a Y contributes at most 1x its weight, while a 1-10 criterion can contribute up to 10x, so keep that in mind.
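To make the skew concrete, here is a small sketch comparing the 0/1 encoding with one possible alternative: mapping Y/N onto the same 0-10 range as the other criteria. The alternative and the helper name `boolean_score` are assumptions for illustration; the tables in this post use plain 0/1.

```python
def boolean_score(value, top=10):
    """Map Y/N onto the same range as the 1-10 criteria (Y -> top, N -> 0)."""
    return top if value else 0


weight = 20  # same weight as Programming Language and the mascot in the example

print(weight * 1)                    # contribution of a Y encoded as 1
print(weight * boolean_score(True))  # contribution of a Y encoded as 10
print(weight * 10)                   # best possible mascot contribution
```

With 0/1, a criterion weighted 20 can add at most 20 points, while an equally weighted 1-10 criterion can add up to 200. Whether that is a bug or a feature depends on how much the boolean criterion should really count.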

Putting it all together

Now we need to do some math: multiply each score by its weight and then sum it all up:

| Criteria              | Score A | Score B | Weight | Final Score A | Final Score B |
|-----------------------|---------|---------|--------|---------------|---------------|
| Programming Language  | Y       | N       | 20     | 20            | 0             |
| Community Support     | 3       | 6       | 20     | 60            | 120           |
| Maturity              | 2       | 4       | 10     | 20            | 40            |
| Support               | 4       | 1       | 5      | 20            | 5             |
| Prototype - How long  | 8       | 5       | 5      | 40            | 25            |
| Prototype - Difficult | 2       | 7       | 10     | 20            | 70            |
| How cute is mascot?   | 10      | 0       | 20     | 200           | 0             |
| Total                 |         |         |        | 380           | 260           |

Life is complicated enough, so business people like things to be simple and not bother with details: They much prefer to only be briefed about the final score. All they want to know is that Solution A scored 380 points vs. 260 for Solution B. After all, they employ you to deal with the (technical) details.

Since business people like this so much they gave this technique the name “Balanced Scorecard”. Wikipedia makes it sound like it's mostly for performance analysis, but it's used everywhere you need to make a selection between several things.
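The multiply-and-sum step from the table can be sketched in a few lines of Python. The scores and weights are the ones from the example, with Y/N encoded as 1/0; `weighted_total` is just a name chosen for this sketch.

```python
# (criterion, score A, score B, weight) from the example table;
# Programming Language Y/N is encoded as 1/0.
ROWS = [
    ("Programming Language", 1, 0, 20),
    ("Community Support", 3, 6, 20),
    ("Maturity", 2, 4, 10),
    ("Support", 4, 1, 5),
    ("Prototype - How long", 8, 5, 5),
    ("Prototype - Difficult", 2, 7, 10),
    ("How cute is mascot?", 10, 0, 20),
]


def weighted_total(scores, weights):
    """Multiply each score by its weight and sum the products."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per score")
    return sum(s * w for s, w in zip(scores, weights))


weights = [row[3] for row in ROWS]
total_a = weighted_total([row[1] for row in ROWS], weights)
total_b = weighted_total([row[2] for row in ROWS], weights)
print(f"Solution A: {total_a}, Solution B: {total_b}")
```

Summing the rows this way gives 380 for Solution A and 260 for Solution B. Keeping the rows as data also makes it easy to rerun the numbers when a score or weight changes.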

There is still some subjectivity to this process: Is the mascot for Solution A really that cute? To get more objectivity into the scoring we can have multiple people score the solutions along the criteria and then combine their scores somehow (a popular choice is to average the reviewers' values for each score).
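Averaging the reviewers' values per criterion might look like the sketch below. The two reviewers and their ratings are made up purely for illustration, and `average_scores` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical ratings for Solution A from two reviewers (invented numbers).
ratings = {
    "reviewer_1": {"Community Support": 3, "Maturity": 2, "How cute is mascot?": 10},
    "reviewer_2": {"Community Support": 5, "Maturity": 4, "How cute is mascot?": 6},
}


def average_scores(ratings_by_reviewer):
    """Average each criterion over all reviewers."""
    reviewers = list(ratings_by_reviewer.values())
    if not reviewers:
        raise ValueError("need at least one reviewer")
    criteria = reviewers[0].keys()
    for scores in reviewers[1:]:
        if scores.keys() != criteria:
            raise ValueError("all reviewers must score the same criteria")
    return {c: mean(scores[c] for scores in reviewers) for c in criteria}


averaged = average_scores(ratings)
for criterion, score in averaged.items():
    print(f"{criterion}: {score}")
```

The averaged values then go into the scorecard in place of a single person's scores. The mean is only one option; a median would be less sensitive to one enthusiastic mascot fan.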

Taking it to the next level

As an organization we want to continuously improve our technology selection process, so we will not only keep those scorecards around but also revisit them occasionally, mostly to calibrate the weights (e.g. a 20% weight for the mascot might be too much). We might also need to add or remove criteria (e.g. if we had known this was a memory hog we would have done something different). In the end we should have, at any given time, an updated template we can use for future technology selections.
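One way to see how much a weight matters when recalibrating is to rerun the scorecard with a few candidate values. The sketch below does that for the mascot weight using the example scores (Y/N as 1/0); the trial weights of 10 and 0 are arbitrary picks for illustration.

```python
# criterion: (score A, score B), with Y/N encoded as 1/0
SCORES = {
    "Programming Language": (1, 0),
    "Community Support": (3, 6),
    "Maturity": (2, 4),
    "Support": (4, 1),
    "Prototype - How long": (8, 5),
    "Prototype - Difficult": (2, 7),
    "How cute is mascot?": (10, 0),
}

WEIGHTS = {
    "Programming Language": 20,
    "Community Support": 20,
    "Maturity": 10,
    "Support": 5,
    "Prototype - How long": 5,
    "Prototype - Difficult": 10,
    "How cute is mascot?": 20,
}


def totals(weights):
    """Return the weighted totals (A, B) for a given set of weights."""
    total_a = sum(SCORES[c][0] * w for c, w in weights.items())
    total_b = sum(SCORES[c][1] * w for c, w in weights.items())
    return total_a, total_b


def winner(weights):
    """Name the solution with the higher total, or 'tie'."""
    total_a, total_b = totals(weights)
    if total_a == total_b:
        return "tie"
    return "A" if total_a > total_b else "B"


for mascot_weight in (20, 10, 0):
    trial = {**WEIGHTS, "How cute is mascot?": mascot_weight}
    print(mascot_weight, totals(trial), winner(trial))
```

With these example numbers, Solution A stays ahead at a mascot weight of 20 or 10 but loses to Solution B once the mascot no longer counts, which is exactly the kind of thing worth knowing before settling on a template.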