Wednesday, January 13, 2010

Your Tests Are Making Your Code Inflexible

A Very Naive Approach


Most of us will agree that making code flexible is a primary motivation in writing Object Oriented code. Otherwise, we would write our app as one giant class, with one giant method, and call it "efficient'. We would copy-and-paste everywhere, because we don't want to introduce any "complexity" by adding methods. We certainly wouldn't want to create whole new Classes... then we would have to look in more than one file to figure out how the system works! That is way too prone to error!

Coming back to Reality

No, of course we don't have that mindset. Our code should be split up into as many classes is necessary for each of them to be responsible for one thing. Each class should, in turn, have the appropriate number of private methods to make each of them trivial to understand in isolation. And the public API of the class should be small and cohesive, so as to keep it from having to change for more than one reason.

And tests! We need lots of tests! We need to make sure that each of our classes is tested at a sufficiently low level to be able to test every feature and expectation. This is where we run into a problem.

What is wrong with this code? (Borrowed from JUnit and Clean Code by Robert Martin):

 assertEquals("expected:<b[a]> but was:<b[c]>", failure);

I have only shown a few tests for brevity, but there are many more like it in the book (pg. 283 of Clean Code). Uncle Bob comments on the code, "I could explain it further, but the test cases do a better job. So take a look at Listing 15-1 and you will understand the requirements of this module in depth. While you are at it, critique the structure of the tests. Could they be simpler or more obvious?"

Simple and/or more obvious, does not a good test make.

I agree that the tests are very simple and obvious. They convey what they are trying to test very clearly. However, I have a problem with these tests. They are not very DRY. In fact, on the scale of DRYness, they are unbelievably WET with duplication.

Those who know what DRY means, know what I am talking about already. Basically, there is a ton of duplication in this code. I am about 87.3% sure that Kent Beck (or whoever wrote these tests) used CTRL+C, CTRL+V more than a few times when writing 50 unit tests that look almost identical. It seems that most people think this is okay, because they’re tests. That you should be writing similar code for similar tests in the name of readability. However, I think if you find yourself writing the same code over and over again, you have a problem with focus. Each test should focus on a feature.  You are probably testing the same thing in every one of those tests you copy/pasted. 

Just like you don’t want one giant test that tests everything under the sun, you don’t want tons of tests that all test the same thing when they shouldn’t care about that thing!

In this example, that something is the format of the return string.

Why Won’t the Maintenance Team Invite you to Lunch? Because you Copy/Paste Test Data

So how does this make it inflexible?  What if we wanted to change the "<" and ">" surrounding the discrepancy into "[" and "]". Well, we would do a simple search and replace in our IDE, you say. What if the substitution isn't so easy? What if we wanted to change the first "..." into "--", but not the second set? We would have to manually go into each test and modify the relevant portions.  What about adding an extra parameter to the constructor that these tests don’t care about?

You have to make a change to every one of your tests.  You have to muck around with the code until it passes again, and now you have more nulls, or even worse, more “magic” numbers, in your tests.

One Solution

You don’t have to sacrifice readability to soak up all that wetness in your code.  Lets take one step forward here, shown in the next example (# of tests cases reduced for clarity):

assertFailureHasProperMessage("[b]","[c]",failure);

So we’ve split out each assert into a method. One big win here, is that the formatting of the message, which is clearly not what we are trying to test, is tucked away in the method.  The string still gets tested just as before, but now we don’t see it in each and every test – which now more clearly shows the bits we are really trying to assert.  Unfortunately, passing in multiple parameters gets hard to read really fast, and we want to be able to use our tests as documentation.  Also, wtf is the first number in the constructor supposed to do?  You have to look at about a dozen test cases to figure out its purpose.

Domain Specific Languages

A really good way to make your code self-documenting is to have lots of small, descriptive function and class names.  Well, since our tests ARE our documentation (at least to other developers on the team, and your future self), we should follow this pattern even more.  instead of:

 image

We could have something like:

image

This allows us to concisely state, in almost regular language, what we want the test scenario to be.  It is a mini-language, called a DSL, or Domain Specific Language.  Notice, also, how the “.compact(null);” section is missing.  That is because we don’t care about it.  We aren’t using the message passed in, so the builder object inserts the null for us.  If we don’t care about it, we don’t want it showing up in our test.

Another benefit we get from this approach, is that we aren’t directly manipulating any of the code; we are calling functions on an object, which eventually manipulates the code.  In my opinion, this separates two concerns that rarely get separated: the semantics of the test, and the syntax of the test.  If we change the constructor to take more data, all we need to do is add another function to the interface, and all the tests will again pass (unless there is a bug, of course). 

You see, we shouldn’t care how many parameters the constructor needs in the tests, we just want to input some data, and make sure the output was correct.  We want to work at a higher level.  We are manipulating at a low level, but describing at a high level.  We do this in our production code as programmers, why not in tests?

Now, we can make our test harness handle null information for us, and handle the passing of intermediate data like the “String failure” in the first example.  Below is the entire Test class with the harness.  Obviously, it looks large as there are only two tests shown here, but the harness is mostly a one-time cost, so its effort in proportion to the benefit gets smaller over time.

 image

What Have We Done Here?

Essentially, we have created our own domain “language” for testing.  It isn’t flexible enough to describe all code; just this class or set of classes.  We can also reuse frameworks like this in other tests.  Generally this pattern is called the “Builder” because of its property of returning itself from a setter method.  This allows us to chain methods together and “build” an object with very little boilerplate code.  It is generally used with fluent interfaces and Domain Specific Languages because it allows for a more language-like syntax directly in the code. 

This type of testing may not work for every case, but I would seriously consider it on your next project, and try it out for yourself.  DSLs are somewhat hard to write well, so practice makes perfect (and I am new at this as well, so I have a lot to learn).  You may find out interesting properties of how your code is structured by writing a DSL for it. 

Disagree? Leave a comment.  A colleague I respect “whole-heartedly disagreed” with me on this, so I am open to a little discussion :).