Wednesday, July 12, 2006

Small Methodology

Judging from the number of posts recently, I think I'm writing this to myself. Oh well; dear diary...

In this post I'm extolling the virtues of smaller methods.

When I say small, how small do I really mean? Several times [1], I've read "no larger than what fits on a single screen", although that's certainly debatable.
Code Complete [2], the best code construction book ever, has a section titled "How Long Can a Routine Be?" It is surprisingly liberal, basically recommending smaller methods yet presenting some evidence that larger methods are not necessarily bad. It almost contradicts itself in another section, "Scope", wherein it recommends "breaking related statements into separate, smaller routines". The latter advice is given in the context of variable scope, so I suppose the lesson there is that length isn't as much of a concern if there are few local variables involved. But... I don't recall ever reading anything advising larger methods. The archives of this site contain plenty of examples of huge (edited) routines that caused no end of problems.

In any case, here are some pros for smaller methods:
  • Much easier during maintenance to find the area of code needing change (let's see, I know I need to change something somewhere in PlayThatFunkyMusic, but it's 1000 lines long... where?). Note that you are in maintenance mode even when first developing the code. [3]
  • Code is so much more readable. Instead of trying to figure out if the next 10 lines are relevant to my problem, I read a MethodNameThatTellsMeWhatItDoes.
  • You end up with more efficient code since it's easier to see what you're doing. If you keep adding more and more to a method without breaking it out into additional methods, there's a good chance you'll do more than is necessary. One example I came across, which surely came about in just this fashion, was a 600+ line method containing a nested 150-line if statement. The nested if checked the same condition as the outer if, but had an else, so the else branch could never run. The compiler probably didn't give the unreachable code error because it was confused - or else I was confused. Either way it made for a very difficult debugging exercise.
  • Communication of intent. Smaller methods with descriptive names are preferable even to comments. Intent is best communicated in structure and naming. From Martin Fowler, "A block of code with a comment that tells you what it is doing can be replaced by a method whose name is based on the comment. Even a single line is worth extracting if it needs explanation". [4]
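
To make Fowler's point concrete, here is a minimal sketch of a commented block turned into named methods. Everything in it (the Order and Billing classes, the 10% loyalty discount, the 100 threshold) is made up for illustration and is not from Refactoring or from any real codebase.

```csharp
using System;

// Hypothetical types and discount rule, invented for this sketch.
public class Order
{
    public decimal Total;
    public bool IsRepeatCustomer;

    public Order(decimal total, bool isRepeatCustomer)
    {
        Total = total;
        IsRepeatCustomer = isRepeatCustomer;
    }
}

public static class Billing
{
    // Before: a comment has to explain what the block is doing.
    public static decimal PriceBefore(Order order)
    {
        decimal price = order.Total;
        // apply the 10% loyalty discount to repeat customers with large orders
        if (order.IsRepeatCustomer && order.Total > 100m)
        {
            price -= order.Total * 0.10m;
        }
        return price;
    }

    // After: the comment has become method names, and the caller
    // reads like a sentence.
    public static decimal PriceAfter(Order order)
    {
        return order.Total - LoyaltyDiscount(order);
    }

    private static decimal LoyaltyDiscount(Order order)
    {
        return QualifiesForLoyaltyDiscount(order) ? order.Total * 0.10m : 0m;
    }

    private static bool QualifiesForLoyaltyDiscount(Order order)
    {
        return order.IsRepeatCustomer && order.Total > 100m;
    }
}
```

Both versions compute the same price; the only difference is that the second one tells you what it is doing without a comment, and the one-line condition gets a name of its own.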

And here are some cons:
  • Function call overhead. Not! Only when a method is called zillions of times would function call overhead (the cost of saving registers, pushing a stack frame, transferring execution to the new method, etc.) make a difference. Besides, we never know when the compiler inlines methods under the covers, eliminating this performance cost altogether.
  • Developer has to navigate around among more methods to get to where he/she needs to be. There is some validity to this argument, but I think it is surely outweighed by the benefits and by the fact that there are ways (bookmarks, rapid finding [are you using GExperts?] and navigating among methods) to compensate for this.

In Extreme Programming Adventures in C# [5], Ron Jeffries (one of the founders of the XP movement) advises, "The 'right' structure for a method that does more than one thing is called 'Composed Method.' Always create a method that is composed of nothing but method calls, all at the same level, rather than expanding any of the ideas out in line." His example refactors this:

public InputCommand(StringReader reader)
{
    lines = new ArrayList();
    String line = reader.ReadLine();
    while (line != null && line != "*end")
    {
        lines.Add(line.TrimEnd());
        line = reader.ReadLine();
    }
    CleanTheLines();
}

into this:

public InputCommand(StringReader reader)
{
    ReadLines(reader);
    CleanTheLines();
}

So each method that does multiple things makes calls to other methods. Once a method does one distinct thing, it no longer needs refinement. It's certainly my experience that such code is very easy to follow.
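
The refactored constructor only makes sense if the loop went somewhere, so here is a rough sketch of where it presumably ended up. The ReadLines body is simply the original loop moved as-is; that is my assumption, not a quote from the book. CleanTheLines is never shown above, so it is left as an empty placeholder, and the Lines property is a hypothetical addition just so the result can be inspected.

```csharp
using System.Collections;
using System.IO;

public class InputCommand
{
    private ArrayList lines;

    // Composed Method: nothing but calls, all at the same level.
    public InputCommand(StringReader reader)
    {
        ReadLines(reader);
        CleanTheLines();
    }

    // Hypothetical accessor, added only so the sketch can be checked.
    public ArrayList Lines
    {
        get { return lines; }
    }

    // Assumed body: the loop from the original constructor, unchanged.
    // Reads until end of input or a "*end" marker, trimming trailing
    // whitespace from each line.
    private void ReadLines(StringReader reader)
    {
        lines = new ArrayList();
        string line = reader.ReadLine();
        while (line != null && line != "*end")
        {
            lines.Add(line.TrimEnd());
            line = reader.ReadLine();
        }
    }

    // Placeholder: the real body is not shown in this post.
    private void CleanTheLines()
    {
    }
}
```

Given the input "a  \nb\n*end\nc", this version collects two lines, "a" and "b", and stops at the marker, the same as the original constructor would have.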

Oh, and remember - BDS (and Visual Studio 2005) make extracting methods pretty easy. Code on, brothers and sisters. Oh wait, I'm just writing this to myself...

Notes
1) Most recently in "Rules for Developing Safety Critical Code", IEEE Computer, June 2006; Code Complete also mentions this rule of thumb.
2) Code Complete is available on the Intranet, even though a search for it shows no results.
3) Interview with Andy Hunt and Dave Thomas, "All Programming is Maintenance Programming".
4) Martin Fowler, Refactoring, page 77.
5) Extreme Programming Adventures in C# is available on the Intranet. See Chapter 9, under the section "InputCommand Refactoring".