We Take Software Quality Seriously
At Dovetail, we take software quality seriously. I wish I could guarantee our products were 100% defect-free, but I can’t. We do everything we can to get as close to that ideal as possible, though. In this post, I’d like to show you how we approach software quality during product development.
To understand our approach to software quality, you must first understand our philosophy. We believe quality is a mindset. It’s not something you do as Step 1, Step 4 or after the software has been developed. Quality is not the responsibility of one or two people on a team or the “Quality Team.” Quality is the responsibility of everyone involved in the product’s development – at all levels. I like to say, “quality is everyone’s job.” If a defect is found in the software by a customer, we all failed a little bit and we take it very seriously. How we process defect reports from customers is the subject of another blog post, however. As I mentioned earlier, this post is about everything that happens before we release a product to customers.
Starting from the very beginning of the initial concept of a feature, we discuss how the feature should operate, how we can verify that it operates properly, and what problems (bugs) might be introduced. A rough test plan is developed alongside the feature specifications.
As the software developers begin coding the feature, they follow the same pattern:
- Write an automated test case that fails.
- Implement just enough code to make the test case pass.
- Repeat from the first step.
The test cases are based directly on the specifications of the feature. Sometimes the test cases test things behind the scenes (machinery deep inside the application), but they are ultimately related to some aspect of the requirements for the feature currently under development.
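To make that cycle a little more concrete, here is a minimal sketch in Python. It is not our actual code: the `normalize_tag` function and its rules are invented purely for illustration. The tests would be written first and fail, and then just enough code is added to make them pass.

```python
import unittest


def normalize_tag(raw: str) -> str:
    """Hypothetical unit under test: trim whitespace and lowercase a tag."""
    return raw.strip().lower()


class NormalizeTagTest(unittest.TestCase):
    # In the write-a-failing-test-first cycle, each of these tests exists
    # (and fails) before the line of normalize_tag that satisfies it.
    def test_strips_surrounding_whitespace(self):
        self.assertEqual(normalize_tag("  urgent "), "urgent")

    def test_lowercases_the_tag(self):
        self.assertEqual(normalize_tag("Urgent"), "urgent")
```

Each test maps to one small statement in the (made-up) specification, which is what keeps the tests tied to requirements rather than to implementation details.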
The idea here is that, to the maximum extent possible, no code is written that doesn’t have at least one corresponding test case. We end up with dozens or sometimes hundreds of these test cases for a single feature. Overall, we have thousands of these test cases, which we call “unit tests.” We run these unit tests dozens of times a day, so they must run fast. We try to keep a run of thousands of unit tests to just under a minute or so.
We’ve even set up a separate server that watches for changes (when a developer commits code to our source control repository), downloads the latest code, and runs all the unit tests. It does this dozens of times a day also. If it runs into problems or tests fail, it emails us to let us know something is wrong. This process is known as “continuous integration.”
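The core of that watch-and-test loop can be sketched in a few lines of Python. This is only an illustration of the idea, not how our server is built: `latest_commit`, `run_unit_tests`, and `notify` are hypothetical callables standing in for the source control query, the test run, and the email notification.

```python
from typing import Callable, Optional


def check_once(
    last_seen: Optional[str],
    latest_commit: Callable[[], str],
    run_unit_tests: Callable[[str], bool],
    notify: Callable[[str], None],
) -> str:
    """Run one polling cycle and return the commit that has now been seen."""
    commit = latest_commit()
    if commit == last_seen:
        return commit  # nothing new was committed, so there is nothing to do
    if not run_unit_tests(commit):
        # Assumed behavior: only failures produce a notification.
        notify(f"Unit tests failed for commit {commit}")
    return commit
```

A real server would call something like this on a timer, or in response to a commit hook, and feed the returned value back in as `last_seen` on the next cycle.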
Unit testing helps ensure that the specific things the developers coded work according to how they’re expected to work. It also ensures that, at a code level, the changes they made didn’t immediately break unrelated code or other features.
We have multiple levels of verification for unit tests: The individual developer’s PC, other developers’ PCs, and an independent server. This avoids the “Works on My PC!” excuse you sometimes hear from developers (including me) when the software works for them, but not for anyone else. For us, the software must work correctly everywhere — no excuses!
Another level of automated testing we do is called “integration testing.” This level of testing involves more aspects of the system operating in concert. Whereas a unit test tests only a specific “unit” of functionality or code, an integration test tests multiple features together.
This is where many defects can occur. An individual feature may work fine by itself in isolation, but when it has to interact with a database or another service, unexpected problems may occur. Integration tests help ensure that the feature works correctly in the context of the entire system.
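As a rough illustration, here is what a small integration test might look like in Python, using an in-memory SQLite database from the standard library. The `cases` table and the `save_case`, `close_case`, and `case_status` helpers are assumptions made up for this sketch, not our real schema or code.

```python
import sqlite3


def save_case(conn: sqlite3.Connection, title: str) -> int:
    """Insert a new open case and return its generated id."""
    cur = conn.execute(
        "INSERT INTO cases (title, status) VALUES (?, 'Open')", (title,)
    )
    conn.commit()
    return cur.lastrowid


def close_case(conn: sqlite3.Connection, case_id: int) -> None:
    """Mark an existing case as closed."""
    conn.execute("UPDATE cases SET status = 'Closed' WHERE id = ?", (case_id,))
    conn.commit()


def case_status(conn: sqlite3.Connection, case_id: int) -> str:
    """Read the status back from the database."""
    row = conn.execute(
        "SELECT status FROM cases WHERE id = ?", (case_id,)
    ).fetchone()
    return row[0]


def test_closing_a_case_is_persisted():
    # The test exercises the code and a real database engine together.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, title TEXT, status TEXT)"
    )
    case_id = save_case(conn, "Printer is on fire")
    close_case(conn, case_id)
    assert case_status(conn, case_id) == "Closed"
    conn.close()
```

Unlike a unit test, a failure here could come from the SQL, the schema, or the way the pieces fit together, which is exactly the kind of problem this level is meant to catch.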
Since these tests are a little more comprehensive and test a broader set of criteria, there are generally fewer of them and they each run a little slower. We have hundreds of these and, combined, they take 5-10 minutes to run. We’re always trying to find ways to reduce this time as we like to run our tests often. Like unit tests, these tests are run frequently by each developer and on the integration server to avoid “Works on my PC!” surprises.
The third and final level of automated testing we do is called “acceptance testing.” This level of testing involves full automation of the entire application. Our tests actually open a web browser and simulate human behavior by clicking on buttons, typing text into fields, and generally navigating and using the application. We have invested considerable time and resources in creating a system where we can author, execute, and verify these automated acceptance tests. These tests catch the most bugs and have paid for themselves many times over.
These tests are less “code-focused” than the other levels of tests and try to use the language of our customers. The idea is that non-technical people can view the test specification and understand what is being verified by a particular test.
Up until this level of testing, developers were doing all the work and none of the other team members were involved. At this level, our testers, product managers, and information developers can all have a hand in authoring, editing, and running tests.
This image is an example of a simple test that verifies that, when a Case is in the “Closed” condition, an agent cannot modify the tags associated with this case.
When run, this test will open a web browser, log in as an agent, navigate to Case 001, and verify that the Tag control (or area) is not visible or usable by the agent since Case 001 is closed.
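For readers who think in code, the logic of that check can be sketched in Python. To be clear, this is not our acceptance test format, and it does not drive a real browser: the `Case` and `AgentSession` classes are hypothetical in-memory stand-ins, and the rule that a closed case hides the Tag control is taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    case_id: str
    condition: str  # for this sketch, either "Open" or "Closed"
    tags: list = field(default_factory=list)


class AgentSession:
    """Hypothetical stand-in for a browser session logged in as an agent."""

    def __init__(self, cases):
        self._cases = {case.case_id: case for case in cases}
        self._current = None

    def navigate_to_case(self, case_id: str) -> None:
        self._current = self._cases[case_id]

    def tag_control_is_visible(self) -> bool:
        # Assumed rule from the text: closed cases hide the Tag control.
        return self._current.condition != "Closed"


def test_agent_cannot_modify_tags_on_closed_case():
    session = AgentSession([Case("001", "Closed", ["billing"])])
    session.navigate_to_case("001")
    assert not session.tag_control_is_visible()
```

The test name and steps read close to the plain-language description, which is the property we care about most at this level.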
We have hundreds of these tests, as well. Unfortunately, they take a while to run (over 30 minutes). We usually run these tests a few dozen times a day, or at least once a day.
There are some situations where a test simply cannot be automated by any technology we have. There are also some cases where a human eye and hand are necessary to truly be sure that a given feature or set of features works properly. This is where our testers come in.
Until this point, our testers have been working with developers on what to test and how to test it, authoring and verifying acceptance tests, and generally overseeing quality and progress in development. As the feature begins to take firm shape, a tester will begin pulling down the latest build, installing it, and interacting with it several times a day. As they find issues or inconsistencies, they point these out to the developers, who discuss them and fix them promptly if necessary.
When the feature is considered done and there is no more active development going on, our testers will do another manual sweep through that feature and related features in a last attempt to ferret out any issues that had been overlooked up until this point.
At some point during development or manual testing, our information developers will begin documenting the new feature. During this process they actually interact with the feature in the application. This serves multiple purposes. Most importantly, the feature gets exercised by someone who is neither a developer nor a tester.
There have been many times when information developers will find issues, idiosyncrasies, or inconsistencies in the application – usually because they’re trying to explain the feature from an end-user point of view. Sometimes developers and testers can miss some issues and extra people viewing the feature from different viewpoints can clear up these issues before we release to customers.
When the feature is done, tested, and documented (or very close to these three things), other members of the Dovetail team will preview the new feature about to be released. This can include sales people, executives, product managers and developers from other teams, etc. The more people looking critically at a feature, the more likely we are to find problems before the software is released.
In this blog post, I talked about the many levels of testing and review that our software goes through before it’s considered “done” and we release it to customers. While it doesn’t guarantee perfection, it does ensure a very low defect count.
As you can see, we take great care in ensuring that when we say our software does something, it really does it. All these levels of testing help not just with ensuring correctness, but also finding any obvious perspectives we may have missed or features that are poorly fleshed out.
The end result is correct software that is usable, makes sense and, most importantly, delivers value.