Thursday, April 05, 2007

Strangely Duplicated

I think avoidance of duplication is one of the most important maxims to follow in software development. Martin Fowler wrote in Refactoring, "By eliminating duplicates, you ensure that the code says everything once and only once which is the essence of good design". I could've sworn he also wrote, "all information in a system should be represented in one place", but I can't find that quote anywhere. Anyway, I would argue that it is more than just good design; it's essential if one wants to avoid problems.

Duplication was a problem I pointed out in some code recently. A design decision was made that is quite simply mind-boggling.
Here's the scenario: the application runs on both desktop and web, and the two share business objects. A huge dummy object (a discussion in itself) contains all of the input UI values. For desktop, the UI is refreshed from the values in that UI object after validation on it occurs. For web, a message pops up (someone else's decision) rather than the UI being refreshed.
So the decision was made that for the desktop, we merely reload the values into the UI after validation (which may change them in the dummy object). Simple enough. For the web, we add an additional giant object that contains the before values and a copy of the after values. And for every single variable in the dummy UI object that is validated (more than 500 of them), there is a separate method that does the following (see the sketch after this list):

  • Caches the original value

  • Sets the new value (if necessary)

  • Stores the new and old value along with a string indicating the name of the variable.


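To make the pattern concrete, here's a minimal sketch of what a couple of those methods look like. The names (UiValues, Change, customerName, zipCode) and the trivial "validation" rules are hypothetical stand-ins, not the actual code in question:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical names throughout -- a sketch of the pattern, not the real code.
    class UiValues {
        String customerName;
        String zipCode;
    }

    class Change {
        final String field;
        final String oldValue;
        final String newValue;

        Change(String field, String oldValue, String newValue) {
            this.field = field;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }
    }

    public class ValidationChangeLog {

        // The extra "before/after" object: one entry per validated variable.
        private final List<Change> changes = new ArrayList<Change>();

        // One of 500+ nearly identical methods, each hand-written per variable:
        public void validateCustomerName(UiValues ui) {
            String oldValue = ui.customerName;   // cache the original value
            String newValue = oldValue.trim();   // validation may change it
            ui.customerName = newValue;          // set the new value (if necessary)
            changes.add(new Change("customerName", oldValue, newValue)); // store name + old + new
        }

        // ...and the same algorithm again for the next variable, and the next:
        public void validateZipCode(UiValues ui) {
            String oldValue = ui.zipCode;
            String newValue = oldValue.replaceAll("[^0-9]", "");
            ui.zipCode = newValue;
            changes.add(new Change("zipCode", oldValue, newValue));
        }
    }
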
When asked why this was done, the initial reply was merely that the web "shows a user message rather than simply changing the variable on screen". When asked why it couldn't simply compare against the values in the UI object, the reply was essentially the same ("it needs to show a message"). In addition, the coder mentioned that the new data structure (with the duplicated data) could be a place to add additional functionality (can you say "YAGNI"?).

I pointed out (rather casually) these problems:

  • Duplication of data.

  • It forces the developer to remember to add a method copying the algorithm above for every new validation rule (which I already forgot to do once, prompting this discussion).

  • The above algorithm is duplicated ad infinitum.


The response was that this is merely my opinion, but I personally think it transcends opinion. If data is ever to be duplicated, there should be a very good reason for it, because duplicated data opens the door to corrupt data - you may as well count on it.
In this case, not only is there no such reason; one wonders how such a solution could even be contrived.

Anyway, this highlights two important red flags in the software development process:

  • Data duplication.

  • Multiple methods with near-duplicate code.



IMO, if you see the former, your design needs correcting. If you see the latter, figure out how to combine the code into a single method. If that's not possible, your design needs correcting.
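For the latter case, a single generic method is usually enough. Here's a rough sketch of one way to do it, again with hypothetical names, where the user message is derived by comparing against the value already held in the UI object (modelled here as a map) rather than keeping a duplicate copy of every before and after value:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical sketch: keep the values in one place and derive "what changed"
    // by comparison, instead of copying the data and the algorithm per variable.
    public class ValidationReporter {

        // Applies a validated value and reports the change in one place.
        // Returns a user message if the value was altered, or null otherwise.
        public static String applyValidated(Map<String, String> uiValues,
                                            String field, String validatedValue) {
            String oldValue = uiValues.get(field);
            if (validatedValue.equals(oldValue)) {
                return null;                       // nothing changed, nothing to report
            }
            uiValues.put(field, validatedValue);   // the single copy is updated
            return field + " was changed from '" + oldValue + "' to '" + validatedValue + "'";
        }

        public static void main(String[] args) {
            Map<String, String> uiValues = new LinkedHashMap<String, String>();
            uiValues.put("zipCode", "123 45");

            // The web front end shows this message; the desktop simply rereads uiValues.
            String message = applyValidated(uiValues, "zipCode", "12345");
            System.out.println(message);           // zipCode was changed from '123 45' to '12345'
        }
    }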