It's drummed into us from our first software design course: simpler code is better code. I won't try to deny that.
This is about the next step of that premise: the idea that simpler functions are better. There are many metrics for this, from simply counting lines of code to algorithms that work out a complexity score from loops, conditions or semantic nets. In the object-oriented programming world we also have to consider functions and data structured into classes or objects, so we also strive for simpler classes or objects. For now, let's keep objects and classes out of it and consider lines and functions.
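To make the scoring idea concrete, here is a toy metric of my own in Python (Concrete CMS is PHP, but the idea is language-neutral). It is only a rough approximation of the cyclomatic-complexity style of scoring, counting branches, loops and boolean operators; real tools score more constructs than this.

```python
import ast


def complexity_score(source: str) -> int:
    """Toy complexity metric: 1 plus one point per decision point."""
    tree = ast.parse(source)
    score = 1  # a straight-line function scores 1
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While)):
            # each branch or loop adds a decision point
            score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two decision points
            score += len(node.values) - 1
    return score


print(complexity_score("def f(x):\n    return x + 1"))  # 1
print(complexity_score("def g(x):\n    if x and x > 0:\n        return x\n    return 0"))  # 3
```

A straight-line function scores the minimum; every extra branch pushes the score up, which is exactly the pressure that drives developers to split functions apart.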
To make my point, consider an example. We have 1,000 lines of code in a single function. Taken to a ridiculous extreme, the smaller-is-better argument would give us 1,000 functions of 1 line of code each. It doesn't quite work like that: introducing functions requires extra code for interaction, so we would actually end up with 1,000 functions of 3 lines each, or a total of 3,000 lines. In practice, no real code could actually be refactored that far, save by creating some stupidly complicated lines. Nevertheless, bear with me for a while.
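A deliberately silly Python sketch of that overhead (the functions are hypothetical, invented for illustration): each "one line of work" gains a def line and a call site, so three lines of logic become roughly nine lines of code, without removing any of the original logic.

```python
# Original: three lines of work in one function.
def total_price(qty, unit_price, tax_rate):
    subtotal = qty * unit_price
    tax = subtotal * tax_rate
    return subtotal + tax


# The extreme: one line of work per function. The reader must now
# chase two extra definitions just to follow the same arithmetic.
def _subtotal(qty, unit_price):
    return qty * unit_price


def _tax(subtotal, tax_rate):
    return subtotal * tax_rate


def total_price_extreme(qty, unit_price, tax_rate):
    subtotal = _subtotal(qty, unit_price)
    return subtotal + _tax(subtotal, tax_rate)


print(total_price(2, 10.0, 0.1))          # 22.0
print(total_price_extreme(2, 10.0, 0.1))  # 22.0
```

Both versions compute the same answer; only the number of places you have to look has changed.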
When we consider any one of these tiny functions, it achieves the lowest possible complexity score by any metric. Each function becomes trivial to test. We have satisfied the gods of our complexity metric, so our code must be better?
What gets lost in this extreme is that whilst we now have fewer lines in each function, we have also introduced the complexity of 1,000 functions interacting, and those interactions are far from obvious because we can't see them all at a glance.
We have traded small scale complexity for large scale complexity.
To avoid that extreme, proponents of complexity metrics suggest an optimum complexity range for a function and for a class. Depending on what value is flavour of the month, our 1,000 lines of code breaks down into 100 functions of 10 lines each, or 50 functions of 20 lines each. That sounds good: functions of this size are something we can all get our heads round. But we still have to understand how those functions interact. There is no getting away from global complexity.
Now consider a real system such as an Addon or Theme package for Concrete CMS. 1,000 lines of code is actually quite small, about the size of a simple block. It's not unusual for an addon to have several thousand or tens of thousands of actual lines of code. Themes can be larger still.
So how do we manage the escalation of global complexity? Here are my own guidelines. As with all guidelines, these are not rules; don't be afraid of breaking them.
Even on a one-developer project, you will thank yourself later.
The objective is a functional, reliable and maintainable system, not a perfect metric score.
If you would like to discuss any of these thoughts, please start or continue a thread on the Concrete CMS Forums.
As Goodhart's law puts it: "When a measure becomes a target, it ceases to be a good measure."

Or, in Campbell's law: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."
Whilst small functions are easier to test in isolation, even in a formally structured development with full unit testing, all this achieves is to delay the real testing until later, when those units are integrated.
Small units can actually work against the objective of finding bugs early, when they are cheapest to fix.
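A hedged illustration in Python (both functions are hypothetical, invented for this sketch): each unit below passes its own unit test, yet the bug only appears at integration, because the two units disagree about units of measure.

```python
def distance_km(speed_kmh: float, hours: float) -> float:
    """Unit A: returns distance in kilometres."""
    return speed_kmh * hours


def fuel_needed(distance_miles: float, mpg: float) -> float:
    """Unit B: expects distance in MILES."""
    return distance_miles / mpg


# Each unit test passes in isolation:
assert distance_km(100.0, 2.0) == 200.0    # correct: 200 km
assert fuel_needed(200.0, 40.0) == 5.0     # correct: 5 gallons for 200 miles

# Integration bug: 200 km fed in as if it were miles. Both units are
# "correct" by their own tests, yet the combined answer is wrong.
trip = fuel_needed(distance_km(100.0, 2.0), 40.0)  # 5.0, overstated
```

No amount of unit testing of either function alone would catch this; only a test of the two together does.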
Let's push that aside; few web developments are formally structured enough to have 100% unit testing. Most testing of web applications is done at integration time, so finding bugs at the unit-test level is not a major influence here.