Recently a very interesting article by Jeremy Jarrel about mutation testing was shared with me. I was really excited about it because I like techniques and approaches that help raise the quality of code (and of tests). I would like to share my experience of trying it for the first time on the project I’m working on now.
What is mutation testing all about?
The idea is fairly simple. Once you have code that is covered by a series of unit tests and all of those tests are green, you can check how good the tests are by changing the tested code a bit and looking at the results. This change is called a mutation. In the ideal case all green tests turn red. If some tests stay green, the test code is not good enough to react to the mutation, so it must be reviewed and corrected. This is called mutation testing analysis.
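To make the idea concrete, here is a minimal sketch in Python (the project in this post is .NET; Python is used only to keep the example short). The functions `is_adult` and `is_adult_mutant` and both tests are hypothetical and invented for this illustration. They are not taken from the project or from MutantPower.

```python
def is_adult(age):
    """Original code under test (hypothetical)."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: the comparison has been inverted."""
    return age < 18


def weak_test(func):
    """A test with no assertion: it only checks that nothing throws."""
    func(30)
    return True


def strong_test(func):
    """A test that asserts on the actual result."""
    return func(30) is True and func(10) is False


def run_suite(func):
    """Return the pass/fail result of every test for the given implementation."""
    return {"weak_test": weak_test(func), "strong_test": strong_test(func)}


original = run_suite(is_adult)
mutated = run_suite(is_adult_mutant)

# A mutant survives a test that was green before and is still green after mutation.
survivors = [name for name in original if original[name] and mutated[name]]
print(survivors)
```

The test that stays green against the mutant is the one that needs a closer look.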
The project to try on
I’m currently working on a project that has been in development for some time, but with no tests, so a lot of legacy code has to be supported. We started writing tests and in about 4 months created nearly 300 of them. It is a good time to try mutation on them. I made sure once more that all tests are green, with no failures at the moment. Let’s go!
MutantPower application
In the article referenced above, Jeremy not only covered the theory but also provided a simple mutation tool called MutantPower. This is what I’m going to use throughout. I downloaded the sources and reviewed them. The code is really simple: it uses the Mono.Cecil library to load an assembly, walk all instructions and change each Brtrue_S instruction to Brfalse_S and vice versa. So I built the application and was ready to start experimenting.
First failure
I have one big assembly with the tested code, so what I need to do is mutate this assembly and re-run all tests to see the results. But on the first run of MutantPower I got a NullReferenceException. I checked in the debugger and the problem was that MutantPower uses the Body of each MethodDefinition object, but a MethodDefinition does not always contain a Body, and the null check was missing. After some corrections it started to work and I could successfully instrument the assembly.
Re-signing the tested assembly
I successfully mutated my assembly and tried to re-run the unit tests. It failed at the very beginning: since the assembly is strong-named, the runtime detects the broken digital signature and refuses to load the library. It has to be re-signed each time after mutating, which is easily done with the sn.exe tool.
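The branch swap and the missing-body guard can be sketched as follows. This is a language-neutral model in Python, not the actual MutantPower or Mono.Cecil code: the `Method` class, the `mutate` function and the opcode strings are assumptions made for illustration, with `body = None` standing in for a method that has no body (for example an abstract one).

```python
from dataclasses import dataclass
from typing import List, Optional

# The two short conditional branches that get swapped, modeled as plain names.
SWAPS = {"brtrue.s": "brfalse.s", "brfalse.s": "brtrue.s"}


@dataclass
class Method:
    """Hypothetical stand-in for a method definition."""
    name: str
    body: Optional[List[str]]  # None models a method without a body


def mutate(methods):
    """Flip conditional branches in place and return the number of changes."""
    changed = 0
    for method in methods:
        if method.body is None:  # the guard whose absence caused the crash
            continue
        for i, opcode in enumerate(method.body):
            if opcode in SWAPS:
                method.body[i] = SWAPS[opcode]
                changed += 1
    return changed


methods = [
    Method("Validate", ["ldarg.0", "brtrue.s", "ret"]),
    Method("AbstractSave", None),
]
count = mutate(methods)
print(count, methods[0].body)
```

Without the `body is None` check, the inner loop would fail on the second method in the same way the original tool failed on the real assembly.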
sn.exe -R tested_assembly.dll C:\Work\rd\Testing\TestApplication\bin\Debug\Company.snk
First unit test results
After re-signing, the assembly loaded successfully and the unit tests started. Some tests passed and a lot of tests failed (as they should during mutation testing). I was quite happy to see the first results. But with mutation you have to be really careful and understand the real reason for the failures. I found that most tests failed with a NullReferenceException from somewhere deep inside the tested assembly. After a bit of investigation I found that one of the vital application classes could not be instantiated. Sure enough, MutantPower does not distinguish between code that needs to be tested and infrastructure code that lets the application run at all. So these first results made no sense, since the testing had not even started.
Configuration for MutantPower
I came up with one simple idea: extend MutantPower with a configuration file that says exactly which types must be included for mutation, leaving infrastructure code as is while affecting the rest of the code (BLL, DAL, pages etc.). So I changed the application to take such a configuration into account. After inspecting all test cases created so far, I came up with this configuration file:
<?xml version="1.0" encoding="utf-8" ?>
<types>
<!-- ASP.net pages types -->
<type included="ard_ardrequest"/>
<type included="ard_periods"/>
<type included="ard_closingsheet"/>
<type included="ard_copyfinancecodes"/>
<!-- DAL types -->
<type included="Company.src.AnnualReport.DAL.ArdCompanyData"/>
<type included="Company.src.AnnualReport.DAL.ArdDepartmentData"/>
<type included="Company.src.AnnualReport.DAL.ArdPeriodData"/>
<type included="Company.src.AnnualReport.DAL.ArdPeriodData.ArdRequestData"/>
<type included="Company.src.AnnualReport.DAL.ArdPeriodData.ClosingSheetData"/>
<type included="Company.src.AnnualReport.DAL.ArdPeriodData.CopyFinanceCodesData"/>
<type included="Company.src.AnnualReport.DAL.ArdPeriodData.PeriodsData"/>
<!-- BLL types -->
<type included="Company.Ard"/>
<type included="Company.ArdCacheClosingSheetData"/>
<type included="Company.ArdUpdateBalance"/>
<!-- Forms and control types -->
<type included="Company.ArdReportTemplate"/>
<type included="Company.EditTypes.ReportDesignerTemplateTypes"/>
<type included="Company.EditTypes.TemplateGroupField"/>
<type included="Company.FormArdCompanyReport"/>
<type included="Company.FormArdTemplateCopy"/>
<type included="Company.FormArdTemplate"/>
<type included="Company.FormCompanies"/>
<type included="Company.FormCompanyGroup"/>
</types>
Now MutantPower will change only the types listed in the configuration file as “included”.
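The filtering step could look roughly like this. It is a Python sketch that assumes the `<type included="..."/>` format shown above. The helper names `load_included_types` and `select_for_mutation`, and the type `Company.Infrastructure.Bootstrapper`, are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# A shortened configuration in the same format as the file above.
CONFIG = """<types>
  <type included="Company.Ard"/>
  <type included="Company.ArdUpdateBalance"/>
</types>"""


def load_included_types(xml_text):
    """Collect the full type names listed as included in the configuration."""
    root = ET.fromstring(xml_text)
    return {
        node.get("included")
        for node in root.iter("type")
        if node.get("included")
    }


def select_for_mutation(all_type_names, included):
    """Keep only the types that the configuration allows to be mutated."""
    return [name for name in all_type_names if name in included]


included = load_included_types(CONFIG)
assembly_types = [
    "Company.Ard",
    "Company.Infrastructure.Bootstrapper",  # hypothetical infrastructure type
    "Company.ArdUpdateBalance",
]
to_mutate = select_for_mutation(assembly_types, included)
print(to_mutate)
```

An allow-list like this keeps infrastructure code untouched by default: a type is mutated only if someone deliberately listed it.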
Next run and a timeout exception
After re-mutating and re-signing the assembly I got the first more or less meaningful results. Yes, a lot of failures, and what is more, my test run hung in the middle! This should never happen if your unit tests are well written.
System.InvalidOperationException : Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
Skipping the details, I found a huge bug in my testing framework: it doesn’t dispose the database connection if an exception is thrown in a constructor. So when you have a lot of failures, the connection pool is quickly exhausted and you get the timeout. Bingo. That might be the first valuable result of mutating. Of course, it is not a bug in production code, but keeping the testing code in good shape is also very important. So I fixed the problem and was ready for the next iteration.
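The shape of this leak and of its fix can be sketched as follows. The real framework code is not shown in this post, so everything here (`FakePool`, `LeakyFixture`, `SafeFixture`, `failing_setup`) is a hypothetical model in Python, with a counter standing in for the real connection pool.

```python
class FakePool:
    """Stand-in for a database connection pool with a fixed size."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.max_size:
            raise TimeoutError("pool exhausted")
        self.in_use += 1
        return object()

    def release(self, connection):
        self.in_use -= 1


class LeakyFixture:
    """Opens a connection, then runs setup with no cleanup on failure."""

    def __init__(self, pool, setup):
        self.connection = pool.acquire()
        setup(self.connection)  # if this raises, the connection is never released


class SafeFixture:
    """Same fixture, but the connection is released if setup fails."""

    def __init__(self, pool, setup):
        self.connection = pool.acquire()
        try:
            setup(self.connection)
        except Exception:
            pool.release(self.connection)
            raise


def failing_setup(connection):
    """Models mutated code that throws during fixture construction."""
    raise ValueError("mutated code failed")


def count_leaks(fixture_class, attempts=3):
    """Construct the fixture several times and report connections still held."""
    pool = FakePool(max_size=10)
    for _ in range(attempts):
        try:
            fixture_class(pool, failing_setup)
        except ValueError:
            pass
    return pool.in_use


print(count_leaks(LeakyFixture), count_leaks(SafeFixture))
```

With only a few failing tests the leak goes unnoticed. It takes a run where most tests fail, as under mutation, to exhaust the pool.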
Analysis of passed tests
Finally, I’ve got some results for analysis. Only 27 tests are passing, so each of them should be reviewed to find the actual reason.
Here is a high-level breakdown of why tests passed:
- Third-party tests: we use HttpSimulator, and its test cases were added to the project together with its sources. Since we are not modifying it and it just works as expected, there is no need to keep these tests in the test assembly. They should be removed.
- Tests on code that has not been mutated: these tests pass because the tested class was not included in the MutantPower configuration. The configuration should be corrected and the run repeated.
- Empty test cases: tests that contain no asserts. These are not really tests, so they should be either refactored to include asserts or removed altogether.
- Smoke tests: the type of test I like to start development with. It just creates the tested object and expects that no failure happens.
That’s it! Nothing that would expose a dumb mistake in either the code or the tests, which is what I really wanted to achieve. So I cleaned up the tests a bit and ended up with only 8 passing ones.
Analysis of failed tests
So, why are tests failing? 99.9% of all failures came from the DAL level with an ArgumentOutOfRangeException, meaning that MutantPower changed the assembly so badly that no data could even be retrieved. If no data is retrieved, the logic does not start to work, and the tests fail. That turns out to be not mutation but breakage. MutantPower in its initial version is too simple and certainly requires more fine-tuning beyond the configuration file. Another aspect is related to our architectural/design problem. Currently the business logic code is tightly coupled to the data access layer: all testing runs against a SQL database with no mocks. So, since the lowest layer is corrupted, the upper layers simply cannot stay alive. OK, this conclusion might be the second valuable result given by mutation.
Ideally all tests must be clearly separated into BLL tests and DAL tests. In my opinion it does not make much sense to mutate DAL code. BLL tests must use mocks so that they are completely independent of the database, and database testing should be part of integration testing. With that in place, mutation might give valuable feedback.
So even the initial analysis of failures showed architectural drawbacks of the application that must somehow be corrected to get more benefit from mutation testing.
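As a rough illustration of that separation, here is a Python sketch of a business-logic test that runs against an in-memory fake instead of the database. `BalanceService`, `FakePeriodData` and `amounts_for` are hypothetical names invented for this example. They do not correspond to the project's real classes.

```python
class BalanceService:
    """Hypothetical business-logic class that depends on a data source."""

    def __init__(self, period_data):
        self.period_data = period_data

    def total(self, period_id):
        return sum(self.period_data.amounts_for(period_id))


class FakePeriodData:
    """In-memory replacement for the data access layer."""

    def __init__(self, rows):
        self.rows = rows

    def amounts_for(self, period_id):
        return self.rows.get(period_id, [])


def test_total_sums_amounts():
    service = BalanceService(FakePeriodData({1: [10, 20, 30]}))
    assert service.total(1) == 60


def test_total_of_unknown_period_is_zero():
    service = BalanceService(FakePeriodData({}))
    assert service.total(42) == 0


test_total_sums_amounts()
test_total_of_unknown_period_is_zero()
```

Because the fake is not part of the mutated assembly, a failure in such a test points at the mutated business logic rather than at a broken data layer underneath it.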
Conclusion
- Mutation testing is definitely interesting, but it takes effort to get benefits from it.
- It would give more meaningful results if the application had a good separation of layers and used mocks in unit tests.
- It helped to see the behavior of tests under more stressed conditions and to find a resource leak in the tests.
- There is room to improve MutantPower with finer tuning.
- It does not seem to be a daily practice, but more likely something to do once per iteration or two. I am also thinking about a criterion tied to the number of tests, for example: as soon as the next 100 tests are added, run mutation testing to check quality.
Finally, I’ve put the sources and binaries on GitHub.
Source code - http://github.com/alexbeletsky/MutantPower
MutantPower application - http://github.com/alexbeletsky/MutantPower/downloads