Testing is often held up as the panacea that enables continuous integration and continuous deployment. Without testing, you are left with a delivery cycle full of manual validations that are slow and cumbersome.
The general guidance with testing is "100% test coverage" or "test everything, everywhere, always, no matter what". What if I were to tell you that the guidance should instead be "test what matters and ignore the rest"?
In this blog post we will explore what that really means and why 100% test coverage is a myth that makes your company just as slow and cumbersome as anyone performing manual testing!
Perfection is the Enemy of Progress
I think we need to first frame the conversation here. We are not sending a rover to Mars or building a device that goes inside someone's body, where a mistake could be the difference between life and death. Most of the time we are programming inconsequential widgets, gadgets, apps, websites, etc. that may or may not drive an economic purpose. None of these inconsequential spaces require a degree of precision where anybody should be looking for perfection.
When building out a fully testable delivery pipeline, the focus is far too often on getting code out to production as quickly as possible while ignoring the ecosystem around that code. Issues then arise, and those teams start plugging the holes in the dam with duct-tape-and-bubble-gum tests and short-term fixes, thinking that they can keep their focus on new initiatives as long as the existing production system can continue to limp along.
Alternatively, some organizations will look at testing and say we need 100% coverage of all the things everywhere. PRs are automagically rejected when the code does not arrive with all of its tests. This is especially an issue on the front end: when someone is building an unordered list, how do you test an unordered list? Sure, you can write some asserts, but aren't you just wastefully writing a test that adds frustration and anger for the person who picks up the work next?
Both of these scenarios have one thing in common: an attempt at stability. How they get there, though, is polar opposite.
The primary problem with testing is simply this: there is no real right way to accomplish acceptable testing. It doesn't exist, and it probably never will.
This is what leads to the primary question of this article: "What Should I Continuously Test In DevOps?"
The right answer is obviously not "nothing" because that would be silly. The right answer is obviously not "everything" because that is impossible.
The right answer is somewhere in between, and we should talk about that.
The Right Level Of Testing
What Are the Goals of Testing?
Automated testing has a few specific goals. Setting aside the massive business (economic) benefit, there are some pretty clear developer-ergonomics benefits to automated testing:
Efficiency Improvement: Automated tests can run much faster and more frequently than manual tests, allowing for more tests to be executed in a shorter time frame.
Clarifies Requirements: Automated tests act as executable specifications, clearly demonstrating the intended behavior of the system. This helps the next person down the line understand the intent of the code instead of blindly reading code with no way to infer what the intent was.
Consistency and Accuracy: Automated tests perform the same steps precisely every time they are run, eliminating human error and ensuring consistency in test execution.
Early Bug Detection: Automated tests can be integrated into the development pipeline, allowing for early detection of defects and issues.
Rapid Feedback: By integrating automated tests into the continuous integration/continuous deployment (CI/CD) pipeline, developers receive immediate feedback on their code, accelerating the development process.
Scalability: Automated tests can easily be scaled to test complex and large applications, something that would be challenging and time-consuming with manual testing.
As you can see, none of these points arbitrarily assume that you need X% coverage to say your testing is "complete". The idea behind testing is really to do two things: make sure that you do not move backwards, and reduce the cognitive load of future developers.
Any automated testing effort should make a developer's life easier, not more tedious, and should allow creativity and problem solving to happen more rapidly. If those things are not happening, you are doing testing wrong.
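To make "executable specification" concrete, here is a minimal sketch of what that can look like. The `calculate_shipping` function and its business rules are hypothetical, invented purely for illustration; the point is that the test names read like requirements and guard against moving backwards.

```python
# Hypothetical example: the function and its rules are invented for illustration.
# Run with pytest; the test names double as a readable specification.


def calculate_shipping(order_total: float) -> float:
    """Orders of $50 or more ship free; everything else pays a flat $5.99."""
    return 0.0 if order_total >= 50 else 5.99


def test_orders_of_fifty_dollars_or_more_ship_free():
    assert calculate_shipping(50.00) == 0.0


def test_orders_under_fifty_dollars_pay_flat_rate_shipping():
    assert calculate_shipping(49.99) == 5.99
```

A future developer can read those two test names and know the shipping rules without digging through the implementation, and any change that breaks the rules fails the build immediately.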
Software Testing
I don't know that I can offer up anything in this space that has not already been said by ThePrimeagen and Theo.
Also, as a shout out to these guys, I love their real-world, down-to-earth view on the applicability of concepts when viewed in the context of "how to be effective in the technology space". There is a lot of bluster out in the internets about how to do something the "right" way, and right is often far less important than how to do something effectively.
Infrastructure Testing
We ignore that infrastructure exists when it is working properly, and we curse everyone responsible for it when even the smallest blip happens. Infra is ignored by everyone who is not responsible for it.
Even though infra is extremely commoditized at this point, testing infra is still a very challenging problem. Monitoring infrastructure, by contrast, is extremely easy, and if you are not doing it, you should be ashamed.
I have often argued that automated testing for infrastructure is synonymous with monitoring, and that the same checks should be reusable in an automated CI/CD pipeline. If you are going to go through all of the effort to build monitoring for a system, that monitoring should have more utility than just telling you whether your system is up or down.
Take, for instance, running a test in a staging environment that checks whether a web server is alive via a health endpoint prior to running a boatload of integration tests. I would argue that your test should be composed in a way that it can also easily be picked up by a monitoring system and run on a schedule in production. This test should have enough useful context in the application code to show what it is intending to validate, which then gives the operations team the same relevant context without doing any extra work. If the test is stuffed into the code base and only used for integration testing, then you are not getting the full value from the work put into that test.
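Here is a minimal sketch of what that dual-use check could look like, assuming a hypothetical `/healthz` endpoint and a `BASE_URL` environment variable; the specifics will differ for your service, but the shape is the point: one check, two consumers.

```python
# Sketch of a health check written once and reused in two places:
# as a pytest gate before integration tests, and as a scheduled monitoring
# probe in production. The /healthz path and BASE_URL variable are
# assumptions for illustration -- substitute whatever your service exposes.
import os
import sys

import requests

BASE_URL = os.environ.get("BASE_URL", "http://localhost:8080")


def check_web_server_health(timeout: float = 5.0) -> tuple[bool, str]:
    """Confirm the web server answers its health endpoint with a 200.

    Returns (ok, detail) so a monitoring system gets the same human-readable
    context the test surfaces in a CI run.
    """
    url = f"{BASE_URL}/healthz"
    try:
        response = requests.get(url, timeout=timeout)
    except requests.RequestException as exc:
        return False, f"health endpoint {url} unreachable: {exc}"
    if response.status_code != 200:
        return False, f"health endpoint {url} returned {response.status_code}"
    return True, f"health endpoint {url} is up"


# Consumer 1: pytest runs this in the staging pipeline before integration tests.
def test_web_server_is_alive():
    ok, detail = check_web_server_health()
    assert ok, detail


# Consumer 2: a cron job or monitoring agent runs the same check on a schedule.
if __name__ == "__main__":
    ok, detail = check_web_server_health()
    print(detail)
    sys.exit(0 if ok else 1)
```

The same function gates the pipeline and feeds the monitoring schedule, so the effort spent writing it pays off twice.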
Writing tests like this also builds cross-discipline empathy between teams that generally have different concerns. You can also push for your operations teams to contribute to automated testing instead of only contributing to their individual siloed concerns.
Building A Bridge Towards An Internal Development Platform
An Internal Development Platform (IDP) is the concept of building out or utilizing a set of tooling that provides baselines and guardrails to enable teams to self-service their way to production. The idea that a separate team should be a gatekeeper in this area should be put out to pasture; as an industry, we should be focusing on getting the responsibility for the work as close to the people doing the work as possible.
I will write more about IDPs later, but I wanted to get this shout out into the article to get you thinking about how continuous testing starts to fit into the mindset of an IDP.
So, What Should I Continuously Test In DevOps?
I think there are a few scenarios where you should really pay attention to testing:
Conveying Intent: Are you working on something that is very complex? Is that complexity masked by the elegance of your code because you have a deep and intimate understanding of the problem set you are solving for? Would documenting said feature look like some kind of 4D cube because it is extremely difficult to represent the technology in a human-readable format? Then you should absolutely be writing tests for your code, to ensure not only that your current thought process is enshrined in functional code but also that the next developer down the line has some bread crumbs to pick up the pieces when they need to maintain and improve on your work. If you are writing something that is simple, very explicit, and has zero edge cases, you can skip the test writing.
Change Events: If you stumble across some code that does not have any testing, you need to change that code, and you know that the system you are working on has implicit contracts with it, write tests before you make the change (see the sketch after this list). Again, skip the zero-edge-case code and focus on the meat. Also, do not try to cover up for other components' deficiencies in design with tests by assuming that poor implementations are concrete and cannot change.
Extra Utility: When you can write tests that have more utility than just being single-use tests, you should take that opportunity. Any code that can serve many end users across many different disciplines is always the right code to write (as long as it improves their working conditions 😊).
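To make the "Change Events" case concrete, here is a minimal sketch of a characterization test. The legacy `normalize_sku` helper and its behavior are hypothetical stand-ins for the untested code you are about to change; the pattern is to pin down the implicit contract your callers rely on today, then refactor behind that safety net.

```python
# Hypothetical example: normalize_sku stands in for untested legacy code
# that other parts of the system implicitly depend on.


def normalize_sku(raw: str) -> str:
    """Legacy helper other services rely on (stand-in for real code)."""
    return raw.strip().upper().replace(" ", "-")


def test_existing_callers_rely_on_uppercase_dashed_skus():
    # Downstream systems expect SKUs in exactly this shape, so lock the
    # behavior in before refactoring the implementation.
    assert normalize_sku("  ab 123 ") == "AB-123"
```

Once the existing behavior is captured, you can change the implementation with confidence that the implicit contract still holds.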
Conclusion
Navigating the complexities of testing is no small feat, and it's frequently misunderstood as merely a supportive tool. Aiming for absolutes, such as 100% test coverage, is not only unrealistic but also counterproductive. Conversely, the absence of testing is equally detrimental. The essence of my argument throughout this article is the importance of striking a balanced and effective blend of automated testing tailored to your specific environment.
This balance is crucial to maintain a steady and acceptable pace of progress. Any factor that hinders this pace warrants a thorough reassessment, with the goal of identifying and implementing strategies to regain momentum.