How should you unit test private methods? It depends. Some say you shouldn’t do it at all. Others say you should test everything. Who is right?
The problem that I have with a statement like, "Don’t test private methods," is that it is a circular argument. That a method is private implies that you don’t want to test it (or haven’t thought about testing it). So it’s like saying, "Don’t test stuff that you don’t want to test." That’s not very helpful.
Similarly, when people say "test everything," I think they’re being a little vague. I can go along with "(unit) test everything that can practically be tested," but I’m not going to pretend that unit testing will take the place of integration testing, that it is possible to test every condition in a complex software system, etc.
Last year, I asked the question, "Why hide information?" Without needing a definitive answer for that question, let’s agree that, given a certain use case and the code which addresses it, there is some portion which must be public in order to be useful. Internally, however, there is probably some additional code which does not need to be public in order to solve the problem at hand. One popular school of thought says that code which is not needed in the public interface should never be public. (And similarly for protected, etc.)
At this point, I would like to point out that testing code is a legitimate use case. I commonly refactor functioning code to make it more testable. This is code which appears to work just fine, and requires no changes in functionality to the end user. Despite the aphorism, "if it ain’t broke, don’t fix it," I’m willing to make these changes, because I think that the ability to run more tests is a good feature to add to the software.
Some would argue that testing the public interface of a piece of code is sufficient, because any internal bugs which are not reflected in the public behavior are not relevant to how the code is actually used. This is true insofar as it goes, but it turns out that exhaustively testing the public behavior of a complex collection of code is extremely difficult, especially if that code has state.
Let’s imagine a function called Foo which accepts an integer argument and returns some value, but doesn’t use any kind of state. There are, therefore, 2^32 possible test cases for this function. Imagine another function, Bar, with the same characteristics. Again, 2^32 possible test cases. That’s a lot, and perhaps we would not want to test every possible case, but even if we did, it’s nothing that a computer can’t handle. Now imagine a type with a public function called Baz which uses both Foo and Bar internally, passing some internal state of the type. Because Foo and Bar are not needed for the end user’s use case, we decide to make them private. The list of possible test cases for the type now probably exceeds what can be performed with a fast computer in your lifetime. Typically, we try to work around such excessively large numbers of test cases by probing boundary conditions. But if private state is involved, this turns out to be very difficult to do. Given that making a field private implies that the implementation may change, probing the behavior of Baz by trying to make it hit the extremes of Foo and Bar is both difficult and likely to be a maintenance nightmare in the future.
In such a situation, I would argue that the need to test the behavior of Baz is by itself reason enough to make Foo and Bar public. If we are positive that both Foo and Bar work correctly (e.g., because we have exhausted the unit tests), then testing the behavior of Baz is either easy or so trivial as to be unnecessary. Because we don’t want to pollute the namespace of the type containing Baz, and because neither Foo nor Bar use any internal state, it would probably make sense to make them public members of a separate type, probably a static type. The point is that even if they are never used anywhere else in the system, it is worth making them publicly visible for the sole purpose of improving the coverage and efficiency of our testing.
{ 16 } Comments
> Let’s imagine a function called Foo which
> accepts an integer argument and returns
> some value, but doesn’t use any kind of
> state. There are, therefore, 2^32 possible
> test cases for this function.
Well, that’s a dodgy argument for several reasons
Just because the return type is integer does not mean there are 2^32 possible return values. For example:
int Foo(int a)
{
return a * 2;
}
That only has 2^16 possible results (and can also throw an exception for sufficiently large a).
Also, for such functions, there is no point unit testing each possible return value. Even with the computer doing it, 2^32 tests will take ages (and I don’t want to be the one writing them all!)
But more importantly, that type of testing would not be testing the function, which requires a maximum of 2 or 3 tests. Exhaustive tests would be testing how multiplication was implemented by the compiler and CPU, which is definitely not required in normal situations.
You should only test the code in the class under test, not external code. In any case, I probably wouldn’t bother writing any tests for functions as simple as Foo, even if they were public.
In those (rare, IME) situations where a private function is so complex as to require tests, IMO it is generally preferable to break it out into a new class, and test it that way.
Which functions are "too simple to test?"
Jim’s function Foo will silently overflow in both C# and C (and Delphi, too, if transliterated), with default compiler settings. (Contrary to popular belief, C# does not do integer overflow checking by default, even in debug builds.) So I wouldn’t expect any exception. Perhaps this function is not so "simple" as it appears!
I said that there were 2^32 possible test cases, not possible return values. In a referentially transparent function, the number of possible test cases is wholly determined by the number of possible inputs. (And that’s the easy case; non-referentially transparent functions are worse, since state is a "hidden argument.") And, yes, as I noted in the post, you probably wouldn’t want to test all of them.
But it is precisely that — the desire to write a few targeted tests instead of to exhaustively test every possible test case — which makes the behavior of Foo and Bar so difficult to probe from the outside. If it was possible to exhaustively test every possible input to Baz, including the state of its type, then the use case for testing would be precisely the same as the use case for Baz. But I don’t know anyone who’s ever managed to do that, and I rather doubt it’s possible.
You’re going to have a very difficult time writing a test for Baz which hits all of the boundary conditions for Foo and Bar, especially when the state of the class containing Baz is considered. And even if you managed to do this, it will probably be broken the first time you rearranged the internal state of the type.
Some good points, thanks! I also think that If testing smaller chunks of code (e.g. Foo and Bar) makes testing a bigger chunk (e.g. Baz) easier or not necessary, you should do it.
Something Michael Feathers wrote that I find true:
* If there are private methods that you want to test (maybe you’ve even extracted these methods just now), it’s legitimate to make it public just for the sake of testing.
* If after you’ve exposed the methods they look like they don’t belong in the class, they’re probably appropriate in an extracted class.
> Jim’s function Foo will silently overflow in
> both C# and C … So I wouldn’t expect any
> exception.
But since I practice TDD, one of my tests would have been to expect one, so I’d soon find that out
However, my point is that counting the number of possible inputs is not a remotely sensible way of calculating the number of required unit tests (excepting possibly some bizarre corner case I can’t think of).
It’s a total red herring. You only need to test each possible code path - which is possibly not quite the same at the cyclomatic complexity rating for a routine, but the number will be similar.
Similarly there is no need to exercise Baz in the way you suggest. Again, as long as al the code paths are exercised, then that’s all that’s required. If you can’t exercise all the code paths in Foo or Bar that way it clearly doesn’t matter, since they cannot be executed.
> You’re going to have a very difficult time
> writing a test for Baz which hits all of
> the boundary conditions for Foo and
> Bar, especially when the state of the
> class containing Baz is considered
I don’t agree with that at all. How many boundary conditions do you think a routine can have? Many/most of them don’t actually have any, and of the ones that do, "null" is pretty much it. And a lot of them you can effectively ignore - would you really try and create a 2G string (or whatever the max length is) to "boundary test" any routine that has a string parameter?
If Foo and Bar are sufficiently complex as to make testing them via Baz too hard, they probably don’t belong where they are. In practice, I find it is very rare to require making anything complex public for testing purposes. More often I need something simple (eg to test if I really did add/remove an item from that private collection)
> Which functions are "too simple to test?"
Most constructors. Getters and setters on properties that are like this:
public int Stuff
{
get { return stuff;}
set {stuff = value;}
}
I’m going to assume the compiler works as advertised without me writing tests to confirm it.
I might not test your Foo and Bar routines. How complicated can they be with only 1 integer input and no use of any state?
In classes that delegate responsibility to other classes (eg that use Strategy, or DI), it’s quite common to have methods that just pass through parameters or function calls, eg
public Money CalculatePay()
{
return payObject.CalculatePay();
}
There’s no point testing that, really.
Often, such routines actually get exercised somewhere else (like constructors do, for example), so the code is called. But it’s too simple to bother writing a separate test for.
I’m very keen on unit testing and have been for many years now. I use TDD a lot. But I don’t believe in 100% coverage with unit tests alone.
And I have never believed that exercising every possible input (or output) value is necessary.
The quantity of boundary conditions is not the question. The question is whether you can be assured of hitting them or not. But when the state of the instance is an argument to a private method, you cannot be assured of anything.
The "too simple to test" question was rhetorical, since the function that you asserted was "too simple to test" turned out to be incorrectly specified.
> The quantity of boundary conditions is
> not the question. The question is
> whether you can be assured of hitting
> them or not.
When the number of boundary conditions is zero then the questions are equivalent. Testing every possible input will presumably test all boundary conditions, but is hardly an efficient method to do it, since the number of boundary conditions is typically very small (single digits). And like I said, there are times you don’t bother, as the boundary conditions are just silly.
> But when the state of the instance is an
> argument to a private method, you
> cannot be assured of anything.
Of course you can, otherwise private methods would be too dangerous to use.
Also, if the public methods do not fully exercise private methods, then there is an argument that there is not really any need to test the unused code paths of the private methods.
> The "too simple to test" question was
> rhetorical, since the function that you
> asserted was "too simple to test"
> turned out to be incorrectly specified.
Eh? What specification are you talking about?
IMO there are functions too simple to be worth testing. I listed some of them. I’m not sure how that makes the question rhetorical.
If there are not boundary conditions, then just substitute "useful test cases." The point is precisely the same; when state is used as an argument, you have little control over the test conditions.
I don’t agree that private methods would be "too dangerous to use" whenever "state is an argument." To the contrary, they are quite useful whenever, well, you need to deal with state. A trivial example is a property getter or setter. A more complex example might call a number of (tested!) functions, passing the state along.
Finally, the specification I referred to was the one in your comment, namely that Foo "can also throw an exception for sufficiently large a." This is demonstrably incorrect.
I feel like we’re going in circles here, and getting away from the original point, namely that testing is by itself a valid use case. I don’t know of any discipline of engineering where this isn’t true. Circuit boards will have test ports built into them. Bridges have expansion points with settings to attach a caliper to check dynamic behavior. In high-energy physics, detectors use a variety of calibration signals in order to check the values returned. We should not pretend that this is somehow less valid than the use case the end-user wants to complete.
> I don’t agree that private methods would be "too dangerous to use" whenever "state is
> an argument."
Neither do I - and that’s exactly what I said.
> Finally, the specification I referred to was the one in your comment, namely that Foo "can
> also throw an exception for sufficiently large a." This is demonstrably incorrect.
I see. Well, as you yourself implied, the code **can** throw an exception. You need the right compiler options, but it still can.
So if you were testing Foo you would write at least one test:
[Test]
public void Foo()
{
Assert.That(Foo(1), Is.EqualTo(2));
}
So you’re covered if someone changes Foo to multiply by 3 or something.
If you wanted the code to throw an exception (you might not, of course) you might also test for that - the boundary conditions, if you will :
[Test]
[ExpectedException(typeof(OverflowException))]
public void FooMaxInt()
{
Foo(int.MaxValue);
}
[Test]
[ExpectedException(typeof(OverflowException))]
public void FooMinInt()
{
Foo(int.MinValue);
}
At this point you could discover you do not have the compiler switch right, so the tests fail. You turn the switch on and the tests pass. TDD in action.
> I feel like we’re going in circles here, and getting away from the original point, namely
> that testing is by itself a valid use case.
I’m a big proponent of unit testing, but I didn’t agree with your 2^32 numbers and the subsequent argument. I still don’t.
That you should write testable code I completely agree with, hence my use of TDD whenever possible. That you might need to refactor to be able to test something more easily is of course true. Testability is one measure of design quality.
However, I do not (in general) equate unit tests with use cases. Unit testing is for classes; use cases should not be that detailed (you’d have to refactor your use cases all the time, it’d just be silly).
In principle, I wouldn’t make unit testing a use case either, I don’t think. In my mind it is just good, professional programming practice. You should just do it whenever it is appropriate. I see no need to explicitly say when it should be done.
Higher level tests (UI, integration, performance etc) are required to test use cases or user stories, IMO (as they are higher level constructs than classes). And I wouldn’t get so recursive as to make that sort of testing a use case
I would get someone else to do that type of testing, too (seriously).
Ah, so you do agree that Foo is not "too simple to test." Me, too. Which gets us right back to the beginning.
> Ah, so you do agree that Foo is not "too
> simple to test."
Actually, no I don’t.
In the real world, I wouldn’t normally test it, as, yes, I think it is too simple to test (in most situations, anyway). It is certainly not worth making it public solely so you could test it.
The example tests I gave were just to demonstrate that 2^32 tests are not required to completely exercise the code. I think 2 or 3 does it, don’t you?
Basing your argument on the possible number of inputs was a red herring, IMO.
Of course I don’t think that 2^32 tests would be required. Which is why I wrote this in the post:
That’s a lot, and perhaps we would not want to test every possible case…
If it seems that I’ve based my argument on the possible number of inputs, then either I was not clear, or you miss the point. The argument is, how do you ensure that you have hit the useful test cases? Without the hidden input of state, this is really easy to do. When state is an input to the function, it is very difficult to do.
If the public function under test is guaranteed to be a referentially transparent function, then I agree that testing the public function is all that is required, no matter what it may do internally. But it is both difficult and rare to make that guarantee in Delphi or C# code.
I do understand your argument Craig, I just don’t think it’s right
> The argument is, how do you ensure that you
> have hit the useful test cases? Without
> thehidden input of state, this is really easy
> to do. When state is an input to the function,
> it is very difficult to do.
It still seems to me that you are worrying about the number of input values to methods. You are then getting concerned about combinatorial explosions of the number of test cases that might be required if internal (not visible to unit test) state is used in those methods.
But input values are not how you work out what tests are required. You only need to be concerned about code paths, both the obvious ones that coverage tools can measure (eg cyclomatic complexity), and less obvious ones like exceptions (and non-exceptions like the integer multiplication example, possibly).
Now, if a public method uses two private methods, you only need to supply the public method with sufficient input values to exercise all the code paths in those 3 methods. That’s got to be orders of magnitude less tests in almost all cases
It is possible that you will not be able to exercise all the code paths in the private methods by calling the public method. But that actually doesn’t matter, since any code calling that method can’t hit those code paths either. If you extend the same reasoning throughout the whole public interface of a class, you can see that it may not be necessary to get 100% test coverage.
This is one reason why private methods rarely need direct testing.
My experience suggests the heuristic of only testing public methods works pretty much alway the time. Any complex private stuff that needed testing I’ve always had to extract elsewhere, as far as I can remember.
Needing to unit test a private method directly is a code smell, IMO.
Generally speaking, I find a human brain is required to determine the required unit tests, but there is an interesting tool under development at MS:
http://research.microsoft.com/en-us/projects/Pex/
Since I prefer to do TDD, it’s not that useful to me currently (all new code, fortunately), but it might be quite handy for legacy code.
You’ll note that I linked Pex in the post.
But Pex works on methods, not classes. It won’t solve the problem of state. It is really useful for TDD, though, since it can find cases you might have missed, with nothing more than a prototype.
It still seems to me that you are worrying about the number of input values to methods.
No. I’m concerned about being able to hit the "interesting" cases. Exhaustive testing is just a really, really inefficent way of doing that.
It is possible that you will not be able to exercise all the code paths in the private methods by calling the public method. But that actually doesn’t matter, since any code calling that method can’t hit those code paths either.
Once more: Unless the methods under test are truly referentially transparent, you can’t know this.
> Once more: Unless the methods under test are
> truly referentially transparent, you can’t know
> this.
That’s not true. Of course you can have non-RT methods where you know you have non-executing code paths.
> It is really useful for TDD, though, since it can
> find cases you might have missed, with nothing
> more than a prototype.
This statement seems to imply something unusual about your understanding of TDD to me.
My experience of it was that is was pretty pointless for true TDD - after all, there is no code for it to generate tests for, until after you’ve already written the tests, right?
You could run it against your code later, as some sort of sanity check, but I didn’t find that a useful exercise.
For existing non-unit tested code (which is approximately all the code ever written) it comes into its own, although I frequently found a need to delete irrelevant tests in my trials.
That’s not true. Of course you can have non-RT methods where you know you have non-executing code paths.
Huh? That is irrelevant to what I actually wrote.
My experience of it was that is was pretty pointless for true TDD - after all, there is no code for it to generate tests for, until after you’ve already written the tests, right?
Have you tried its assertions?
{ 1 } Trackback
[...] the applications and provides testing and diagnostic information at runtime. I said before that testing is a valid use case, and this is one example. Another, report-related example is that whenever our applications [...]
Post a Comment