Thursday, December 9, 2010

'Taking it too far' v 'Taking too much time'

Coderetreat is about spending the day writing perfect code. During our day-to-day programming tasks, whether they are for pay or our side-projects, we always have the end goal of getting something done. This inevitably causes us to cut corners, writing less-than-perfect code. Why do we do this? For one, it is because writing the code that we 'know we should write' will take too long. We have to weigh the value of merciless refactoring and ultimate clean code with the reality of the benefit we get from delivering something. At coderetreat, though, the pressure to finish goes away, deleting the code makes it so that you can't finish the problem. This allows us the freedom to act as though we have an infinite amount of time.

As I've traveled around and had the opportunity to work with people of widely varying experience and skill level, I've noticed that the big difference is which corners get cut. For people with less experience, say 1-5 years, the 'get it done' code is significantly different, less clean, than the code written by those with a lot more experience, say 15-20 years, yet written in the same amount of time. This makes sense and isn't a knock against beginners, it is just a good indication that there is always more to learn. The gap between what we actually write and what we think we would write given enough time is always there, just at a different level. As we gain more experience, that gap simply moves upward, so the code is cleaner given the same constraints.

I've facilitated a lot of coderetreats. I've watched people of widely varying skill levels work on the same problem. Some will write very readable code, others will write less readable code. My role as a facilitator is to ask the questions about the code, questioning the refactoring decisions, and, more often, helping people understand why they stopped refactoring.

Suppose you have this method, a very common one used to figure out whether a living cell will die in Conway's Game of Life:



Commonly a team arrives at this and moves to the next test. I'll ask if they aren't skimping a bit on the refactoring and get a response similar to "well, we don't want to go too far" or "this is very readable." I'll generally add a little code:



That is clearly more readable; most teams that see this small change agree. My question is "why did you stop?" The readability also exposes a few other problems, mostly around the method names, but that is a topic for a different blog post. So, the question remains "why is the first form 'good enough'?" After all, the extracted version is more readable, captures the domain better (underpopulated vs a cryptic number comparison) and is not much more work to extract, especially with basic refactoring support.

[Just a note here that this is not the absolute end state of the system, but an intermediate step along the way of evolving the design. If teams get that far, this usually ends up being extracted into another object that resolves some of the design issues inherent in it.]

One hypothesis is that we often mix up the idea of "going to far" with "taking too much time." After all, I would be surprised if someone would say that the extracted version truly is taking it too far. However, I do believe that some people would argue that it might take them too much time to do this all the time. I've worked with people who didn't have a good understanding of their toolsets, so this sort of extraction would take more time: manually extracting it doesn't take a lot of time, but it is definitely non-zero. So, creating code like this is time-consuming on a larger scale, so we don't do it. This, though, is really about "taking too much time."

Another hypothesis is that we aren't comfortable with more abstractions. For various reasons, most of us don't have a great familiarity and comfort with abstractions. One difference I've noticed between people with varying experience levels is a view on abstractions and how they handle them. I like to say that programming is based a lot on just being adept at abstraction wrangling, and my experiences with people has shown me that this has more than a hint of truth to it. As we encounter the creation and manipulation of abstractions, we get a better sense of where they are hiding in the system and how to be effective in drawing them out into the open. A common argument against heavy abstraction is that you then have to jump around the system to see what is 'really' going on. For people who are truly comfortable with it, though, you either don't need to see what is 'really' going on, relying on the names to truly reveal their intent. And, when you do need to see what a specific part of the system is doing, navigation can be a trivial act, due to a clear signpost guiding you to your destination.


These are just musings of mine. I don't have the answers; these thoughts are based on my experiences pairing with different people and watching people of various levels writing the same problem over and over at coderetreats. I'd love to hear any ideas that others have either in comments or as blog posts. I think this is an important topic that can help us understand ourselves better both as a means of accurate self-assessment and developing more effective training mechanisms.