Top TDM Metrics
by Carlos Schults
“You can’t improve what you don’t measure.” I’m sure you’re familiar with at least some variation of this phrase. The saying, often attributed to Peter Drucker, speaks to the importance of metrics as fundamental tools to enrich and improve business processes of all kinds. Metrics are crucial in many different areas. Software development and software testing are certainly no exception. That’s why today we’re here to talk about TDM metrics.
What are TDM metrics? Why should you care about them? In short, TDM metrics are indicators you use to gauge the health of your Test Data Management approach. We already know that TDM is crucial for having a solid testing strategy. A great testing strategy is, in turn, essential for developing and delivering high-quality software in a timely manner. So, it follows that TDM metrics play a crucial, yet often overlooked, role in a well-rounded software quality strategy.
That’s what this post is all about. We’ll start with a brief but important introduction to the concept of TDM. After explaining what TDM is all about and why it’s so important, we’ll move to TDM metrics. You already know it’s impossible to improve what you don’t measure, but you might be wondering whether there are more detailed reasons to be so concerned about TDM metrics. There are, and you’ll get to know them in that section.
Finally, we’ll walk you through our list of five essential TDM metrics. By the end of the post, you’ll understand not only why TDM is important but also which TDM metrics you should track and improve in order to get the most out of your testing strategy.
Let’s get started.
Test Data Management: A Brief Introduction
Before we move on to TDM metrics, we’ll offer a brief “what and why” of the concept, so we’re all on the same page regarding its meaning, importance, and the reasons behind its use.
TDM, as we’ve already mentioned, stands for Test Data Management. Though we do have a whole other post explaining TDM and its importance in depth, here’s the TL;DR version:
Test Data Management is the process of providing high-quality data to your test environments in a mostly or totally automated way.
That’s pretty much it, really. However, keep in mind that TDM has to ensure not only the existence of test data but also its quality and its availability. Test data needs to be there when test cases need it, and it must appear in the right amounts and conditions.
Additionally, don’t forget we currently live in the post-GDPR era. There are lots of privacy and security concerns when it comes to test data. For instance, one of the most popular techniques for obtaining test data is production cloning (i.e., copying real data from production servers). So, to protect users’ privacy and keep your organization compliant, there are more procedures—like data masking—you need to perform on the data before it’s ready for use.
Is It Worth Caring So Much About Test Data?
After reading the definition above, you might’ve reached the conclusion that TDM is really difficult to pull off. Maybe you’re even wondering whether it’s really worth it.
Well, managing test data is indeed a challenging task. Is it worth it, though? It sure is.
The importance of test data can be summarized with the old saying, “garbage in, garbage out.” If you feed poor test data to your test cases, you’ll get poor results. Full stop. It won’t matter a bit that you have a solid testing strategy with talented professionals using all of the bleeding-edge tools.
Don’t get me wrong; all of those things are good and necessary. But all of that investment will have been for nothing if you don’t care about providing great test data to your tests.
TDM Metrics: 5 You Should Be Aware Of
Without further ado, let’s cover five important traits of a useful test data set. By tracking and improving said characteristics, you’ll be on your way to improving your TDM approach.
Data Literacy
What we mean by data literacy is the capacity to process, understand, and analyze one’s data. Data literacy is a versatile term. It’s simultaneously a skill that professionals dealing with data should have and a desirable trait in said data.
Perhaps, when it comes to the data itself, a better name would be data readability, but the point is the same. Test data should be understandable, processable, and analyzable, especially by computer systems other than the ones used to create it. That’s particularly important for techniques that obtain test data by copying real data from production servers.
Data Security
The second item on our list is a no-brainer. It should surprise zero people that it’s essential for test data, and data in general, to be secure. Though that’s always been the case, nowadays digital security is more important than ever. Organizations must employ all means necessary to ensure that sensitive data, especially users’ personal data, won’t be accessed by unauthorized actors, which includes testing professionals such as QA analysts and testers.
Securing data isn’t just the morally correct thing to do. It’s the legally required thing to do due to GDPR and similar legislation across the world. But it’s also the smart thing to do. Regardless of privacy regulations, protecting user data is the right move because failing to do so harms an organization in many ways, especially by damaging its reputation.
So, approaches like data masking become vital when obtaining test data from production cloning.
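To make the idea of masking more concrete, here’s a minimal sketch in Python. It assumes a hypothetical customer record with `name` and `email` fields; the `SALT` value and the helpers `pseudonym` and `mask_customer` are invented for illustration and aren’t part of any particular TDM tool. The sketch replaces personal values with salted hashes, so the same input always maps to the same masked value while the real value stays out of the test environment. Keep in mind that this is closer to pseudonymization than to full anonymization, so treat it as a starting point rather than a compliance guarantee.

```python
import hashlib

# Hypothetical salt for illustration; a real one should be kept secret
# and out of source control.
SALT = "replace-with-a-secret-salt"


def pseudonym(value: str, length: int = 12) -> str:
    """Return a short, deterministic hex token derived from the value."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()
    return digest[:length]


def mask_customer(row: dict) -> dict:
    """Return a copy of the row with personal fields replaced by tokens."""
    masked = dict(row)  # leave the original row untouched
    masked["name"] = "Customer " + pseudonym(row["name"], 8)
    masked["email"] = pseudonym(row["email"].lower()) + "@example.test"
    return masked


# Made-up record standing in for a row cloned from production.
production_row = {
    "id": 42,
    "name": "Jane Doe",
    "email": "Jane.Doe@corp.com",
    "plan": "premium",
}
masked_row = mask_customer(production_row)
```

Because the masking is deterministic, a customer that appears in several tables gets the same masked email everywhere, which helps keep relationships between records intact after masking.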
Data Age
Test data gets old. That might sound like a weird statement, but it’s true. What do I mean by it?
In your application, there might be certain tasks or procedures that use timestamped pieces of data. The procedures might require that those dates be recent. What would happen if the test data came from a production snapshot taken three years ago?
That’s right: The tests targeting those procedures wouldn’t work well, or they might produce inconsistent results. That’s why data age might be an important metric to monitor. Depending on how important it is for your testing strategy that your test data is “fresh,” it might be useful to use techniques such as test data aging to artificially tamper with the dates in the test data.
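Here’s a rough Python sketch of both ideas: measuring data age and aging the data. It assumes each record carries a single timestamp field. The function names `data_age_days` and `age_records`, the `created_at` field, and the sample orders are all hypothetical. The aging step shifts every date by the same offset, so the newest record lands on the reference date and the gaps between records are preserved.

```python
from datetime import datetime


def data_age_days(timestamps, now):
    """Age, in whole days, of the most recent timestamp in the data set."""
    return (now - max(timestamps)).days


def age_records(records, field, now):
    """Shift every date by one shared offset so the newest record lands on `now`."""
    offset = now - max(r[field] for r in records)
    return [{**r, field: r[field] + offset} for r in records]


# Made-up data standing in for an old production snapshot.
now = datetime(2021, 4, 1)
orders = [
    {"id": 1, "created_at": datetime(2018, 3, 1)},
    {"id": 2, "created_at": datetime(2018, 3, 31)},
]

stale_days = data_age_days([o["created_at"] for o in orders], now)
fresh_orders = age_records(orders, "created_at", now)
```

Shifting by a single offset is the simplest possible approach. Data with several related date columns, or with rules such as “no orders on weekends,” would probably need something more careful.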
Data Quality
We use “quality” here as an umbrella term to cover a few different traits that useful test data should have. For instance, for test data to be successfully used in test cases, it has to maintain integrity. That basically means it respects the constraints and rules of the database schema and adheres to the domain rules of the application under test.
For instance, suppose your application is a program to manage schools. One of your domain rules is that a student must be enrolled in at least one course. A student not enrolled in any course is a violation.
What could’ve caused such an invalid state? If you’re dealing with “fake” test data, then the answer is a faulty synthetic data generation process. On the other hand, if you’re working with data cloned from production, something might’ve gone wrong when performing a process such as data subsetting.
Regardless of the cause, what really matters is having mechanisms in place to enable you to detect and fix such inconsistencies in quality.
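As an illustration, the school rule above can be checked with a single query. The sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The `students` and `enrollments` tables, their columns, and the sample rows are assumptions made up for this example, not a schema from any real application.

```python
import sqlite3

# In-memory database with a hypothetical, deliberately tiny schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(id),
        course TEXT NOT NULL
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Caio');
    INSERT INTO enrollments VALUES (1, 'Math'), (1, 'History'), (3, 'Physics');
""")


def students_without_courses(conn):
    """Return (id, name) for every student with no enrollment rows."""
    return conn.execute("""
        SELECT s.id, s.name
        FROM students AS s
        LEFT JOIN enrollments AS e ON e.student_id = s.id
        WHERE e.student_id IS NULL
        ORDER BY s.id
    """).fetchall()


# Ben has no enrollments, so he is the only violation in this data set.
violations = students_without_courses(conn)
```

The number of rows returned by a check like this can serve as a simple quality metric: zero means the rule holds, and anything above zero points to records that need fixing before the tests run.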
Automation
Automation is a metric less related to your test data itself and more to your whole TDM approach. How widespread is the use of automation throughout your organization? Do you have data compliance processes built into your DevOps strategy?
More importantly, how automated is the tracking of your data literacy, data security, data age, and data quality metrics?
It’s essential to automate the process of test data profiling and verification. Only with automated alerts and monitoring will you be able to really improve the traits above and take your TDM approach to the next level.
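One way to picture automated verification is a small runner that measures each metric and raises an alert when a limit is exceeded. The Python sketch below is only that, a sketch. The `Check` class, the metric names, the limits, and the hard-coded measurements are all hypothetical stand-ins for real profiling code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    name: str                      # label used in alert messages
    measure: Callable[[], float]   # function that returns the current value
    max_allowed: float             # alert when the value goes above this


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check and return one alert message per exceeded limit."""
    alerts = []
    for check in checks:
        value = check.measure()
        if value > check.max_allowed:
            alerts.append(f"{check.name}: {value} exceeds limit {check.max_allowed}")
    return alerts


# The lambdas return fixed numbers here; in practice each one would
# query or profile the test data set.
checks = [
    Check("data_age_days", lambda: 45, max_allowed=30),
    Check("unmasked_email_count", lambda: 0, max_allowed=0),
    Check("integrity_violations", lambda: 1, max_allowed=0),
]
alerts = run_checks(checks)
```

A runner like this could be scheduled or wired into a pipeline step that fails when the list of alerts isn’t empty, so stale, unmasked, or inconsistent data gets noticed before it reaches the tests.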
Give Your Tests Some Love by Providing Them Great Data
Today we’ve covered yet another TDM-related topic: TDM metrics. We’ve argued that software testing is crucial for achieving high-quality software and that great test data is essential for a healthy testing strategy. It follows that anything you do to improve your TDM approach is an important component of your overall software quality strategy. TDM metrics have been a mostly overlooked piece of the software quality puzzle. In this post, we’ve set out to change that by offering a list of five TDM metrics that, if tracked and improved upon, can make your testing strategy more efficient and sound.
This post was written by Carlos Schults. Carlos is a .NET software developer with experience in both desktop and web development, and he’s now trying his hand at mobile. He has a passion for writing clean and concise code, and he’s interested in practices that help you improve app health, such as code review, automated testing, and continuous build.