You should be using data-testid

If this title made you upset, you've probably read one of these:

React Testing Library docs:

"Using data-testid attributes do not resemble how your software is used and should be avoided if possible. That said, they are way better than querying based on DOM structure or styling css class names."

Derek Davis' blog post:

"Simply put, accessing everything through test ids isn't testing your application the way a user would, which is our ultimate goal. We're relying on an arbitrary id, an implementation detail, to access a DOM node. This certainly works, but there's plenty of room for improvement."

Kent C Dodds' blog post:

"Sometimes you can't reliably select an element by any of the other queries. For those, it's recommended to use data-testid (though you'll want to make sure that you're not forgetting to use a proper role attribute or something first)."

The Playwright docs:

"Testing by test ids is the most resilient way of testing as even if your text or role of the attribute changes the test will still pass. QA's and developers should define explicit test ids and query them with page.getByTestId(). However testing by test ids is not user facing. If the role or text value is important to you then consider using user facing locators such as role and text locators."

Common Objections to data-testid

  1. Testing implementation details is an anti-pattern, and data-testid is an implementation detail

  2. Real users don't access elements by data-testid

  3. We don't get access to implicit accessibility assertions that come with using "user-facing" strategies like getByRole or getByLabel

  4. The data-testid ends up in production code unless you remove it, which exposes your test coverage to your users

  5. Adding code that only serves tests is a waste of development time

#1 - Testing implementation details

The argument against relying on implementation details is a good one:

We want our tests to break when behavior changes, not when our code changes.

This is great wisdom, but data-testid is an exception to the rule. Why? Developers don't change the data-testid unless they remove the HTML element from the document. They have no reason to modify this attribute because their code does not rely on it for anything.

Meanwhile, attributes like role, text, or label, while unlikely to change frequently, are still more likely to change than a data-testid, because they are tied to user-facing functionality.

No user-facing functionality stays the same forever, which makes these attributes inherently unreliable in the long term.

Because a data-testid only serves test code, we get to decide how reliable it is, not the product roadmap.
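To make that concrete, here is a minimal sketch of a copy change breaking a role-based lookup while the test id lookup keeps working. It is a toy in-memory model, not a real browser: `FakeElement`, `findByRole`, and `findByTestId` are hypothetical stand-ins for what Playwright does with `page.getByRole('button', { name: 'Login' })` and `page.getByTestId('submitBtn')`.

```typescript
// Toy model of a rendered element. This is NOT Playwright's API.
interface FakeElement {
  role: string;
  name: string; // accessible name, here simply the visible text
  testId?: string;
}

// Hypothetical stand-in for page.getByTestId(testId).
const findByTestId = (dom: FakeElement[], testId: string) =>
  dom.find((el) => el.testId === testId);

// Hypothetical stand-in for page.getByRole(role, { name }).
const findByRole = (dom: FakeElement[], role: string, name: string) =>
  dom.find((el) => el.role === role && el.name === name);

// Release 1: the button reads "Login".
const releaseOne: FakeElement[] = [
  { role: "button", name: "Login", testId: "submitBtn" },
];

// Release 2: product renames the button. Nobody touched the test id.
const releaseTwo: FakeElement[] = [
  { role: "button", name: "Sign in", testId: "submitBtn" },
];

const roleHitBefore = findByRole(releaseOne, "button", "Login") !== undefined;
const roleHitAfter = findByRole(releaseTwo, "button", "Login") !== undefined;
const testIdHitAfter = findByTestId(releaseTwo, "submitBtn") !== undefined;
```

Both lookups find the button in the first release. After the rename, only the test id lookup still does, even though the login behavior itself never changed.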

#2 - Testing like a "real user"

This assumes we're able to mimic a real user's behavior with an automated script.

I don't know about you, but I don't typically zoom through an app interface at the speed of light. There are plenty of times an automated test fails, and the root cause is "the test user is going too fast."

There's nothing "real" about the end-to-end tests we create with tools like Playwright. They're simply meant to approximate the end-to-end experience, and even with data-testid, they do a great job of that.

No matter how you target your DOM nodes, you're still targeting DOM nodes.

[Image: Inspect view showing the properties of HTML elements on the page]

You're not targeting a visual interface the way a real user would, and you're not licking the Cheeto dust off your fingers before clicking a button (because I don't want Cheeto dust on my mouse, thanks).

If that doesn't make sense, look at it another way.

[Image: GoodReads login form]

When you are filling out a login form as a human, which do you see:

a) the image of a login button


b) <button role="button" data-testid="submitBtn" type="submit">Login</button>

If you want the "real" user experience, maybe you'd be happier with automated tests that leverage computer vision rather than a getByRole locator.

#3 - Implicit a11y assertions

Accessibility is important. No way am I going to argue against that.

But hey, we can do better than a getByRole or getByLabel, can't we? Of course.


I'd rather do explicit accessibility testing than rely on a locator strategy. Playwright fully supports an Axe integration for exactly this, FOR FREE.

The problem with relying on locators to implicitly verify accessibility? Your tests are not interacting with the entire document of every page, so you'll miss plenty of a11y issues. Tools like Axe can find those for you, along with the ones your locators would have caught.

[Image: example table of a11y violations detected by Axe tests]
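As a rough sketch of what explicit a11y checking can look like, the snippet below turns a list of Axe violations into readable report lines. The commented-out scan assumes the `@axe-core/playwright` package and its `AxeBuilder` class; `ViolationLike`, `summarizeViolations`, and the sample data are my own illustrative names and values, not output from a real scan.

```typescript
// Minimal subset of the fields on an axe-core violation entry (assumption:
// real entries carry more, such as tags and helpUrl).
interface ViolationLike {
  id: string;
  impact?: string | null;
  help: string;
  nodes: unknown[];
}

// Turn each violation into one readable line, e.g. for a failure message.
function summarizeViolations(violations: ViolationLike[]): string[] {
  return violations.map(
    (v) => `${v.impact ?? "unknown"} | ${v.id} | ${v.help} | ${v.nodes.length} node(s)`
  );
}

// In a Playwright test the array would come from a full-page scan
// (assumes the @axe-core/playwright package is installed):
//   const results = await new AxeBuilder({ page }).analyze();
//   expect(results.violations).toEqual([]);

// Hand-written sample data for illustration, not real scan output.
const sampleViolations: ViolationLike[] = [
  { id: "color-contrast", impact: "serious", help: "Text needs more contrast", nodes: [{}, {}] },
  { id: "label", impact: "critical", help: "Form field has no label", nodes: [{}] },
];

const a11yReport = summarizeViolations(sampleViolations);
```

The scan looks at the whole page, so it does not matter which elements your locators happened to touch, and your locators are free to use test ids.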

If you care about accessibility, don't test it halfway, and test it in a way that doesn't compromise the resilience of your locators.

If this is too much work for you, or you think this is a little heavy-handed, what does that say about your concern for your app's accessibility? Seems like if a11y is a first-class citizen, it's worth a few more lines of test code ¯\_(ツ)_/¯

#4 - "Leaking" a data-testid

I don't even know why this gets brought up. Security? Are we concerned about a hacker knowing how much coverage we have? Are we concerned that easier test scripting for us means easier malicious scripting for them?

If y'all are worried about making your DOM nodes easier to target, let me assure you as a QA who has had to use many different locator strategies:

Taking away a test id won't save you. There are plenty of ways to target DOM nodes.

And if you're worried about them seeing which areas have poor test coverage? Cool. Don't have poor test coverage, then.

Or, if you insist on being paranoid, add a script to remove all data-testid attributes from your code at build time so they don't show up in production builds.
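Such a script can be tiny. Here is a minimal sketch, assuming it runs over already-rendered HTML strings; `stripTestIds` is a hypothetical helper, and a regex is a blunt tool here (it would also strip a matching string from visible text), so a real build would more likely do this in a compiler or bundler plugin.

```typescript
// Remove data-testid="..." (or '...') attributes together with the
// whitespace that precedes them.
function stripTestIds(html: string): string {
  return html.replace(/\s+data-testid=(?:"[^"]*"|'[^']*')/g, "");
}

const devMarkup = '<button data-testid="submitBtn" type="submit">Login</button>';
const prodMarkup = stripTestIds(devMarkup);
// prodMarkup is now: <button type="submit">Login</button>
```

If you do this, run your end-to-end tests against a build that still has the test ids, since the stripped build gives your locators nothing to find.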

#5 - A waste of development time

My answer to this is always simple: I'll own it.

I'm the QA, I'm the one responsible for test code quality, so I'll happily take the ownership of including test ids where I want them for automated tests.

This answer might bother you if you are uncomfortable editing feature code, but in my experience, it's the easiest way to get buy-in from the team on using test ids, because they don't have to change how they work.

It takes a few seconds to add test ids yourself and create a PR, and it's the kind of PR that needs only an "LGTM" and a quick approval because it's not hard to review.

The second easiest way -- assuming you don't have permission to create PRs that touch feature code -- is to do the "annoying" part and identify which elements should have test ids before development begins, and request that your devs add them. You can even tell them what names to give the test ids so they are just copy-pasting.
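The hand-off can be as simple as a short list of names. Here is a sketch of what I mean, with hypothetical ids for a login form; `loginTestIds` and `testIdSelector` are illustrative names, not part of any library.

```typescript
// Hypothetical test ids for a login form, agreed on before development starts.
const loginTestIds = {
  emailInput: "loginEmailInput",
  passwordInput: "loginPasswordInput",
  submitButton: "submitBtn",
} as const;

// Build the CSS attribute selector for a given test id.
function testIdSelector(testId: string): string {
  return `[data-testid="${testId}"]`;
}

const submitSelector = testIdSelector(loginTestIds.submitButton);
// submitSelector is now: [data-testid="submitBtn"]
```

Devs copy the names into the markup, and the tests reference the same list, so nobody has to guess what an id is called.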

I can't say this works in all teams, though, and that's because of an important caveat I must mention, and it's something you've probably already thought of while reading this...

An important caveat

I realize I'm writing this from the privileged perspective of an in-house QA Engineer embedded in the development team who has permission to modify feature code.

You might be asking yourself -- and rightly so -- "What if we don't have control over the codebase?" or "What if my team refuses to use test ids?"

Many people are skeptical of using test ids, and some never use them as a rule. We can't change everyone's minds, and that's OK. We don't need to.

As the objections point out, there are plenty of viable alternatives to test ids. They are inherently less reliable, but even if you have to use an XPath, it's not the end of the world.

You can learn to write XPaths that are more stable than the auto-generated ones you may have seen (you know, the mile-long ones that start at the root of the document?).
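For a rough idea of the difference, compare the two expressions below. The markup they target is hypothetical (a login form with `id="login"`), so treat the exact paths as assumptions.

```typescript
// Auto-generated style: an absolute path from the document root. It breaks
// whenever any ancestor is added, removed, or reordered.
const brittleXPath = "/html/body/div[2]/div/main/form/div[3]/button";

// Hand-written style: anchored on attributes close to the target, so
// unrelated layout changes higher up the tree don't matter.
const stableXPath = "//form[@id='login']//button[@type='submit']";

// In Playwright, either string could be passed as page.locator(`xpath=${...}`).
```

The second one still depends on the form's id and the button's type, but those change far less often than the number of wrapper divs on a page.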

You can use the role and other "user-facing" strategies. They really are the next best thing; there's a reason people like them.

There are options for everyone, even if you're given the least accessible codebase on the planet.

But on the off-chance you do have control over your team's codebase, and you are willing to modify feature code to save your team some time, you should use a data-testid so your tests break when your assertions fail, not when your locators fail to find a DOM node.