There's Something Odd About the Official Playwright MCP Demo
In the demo, the presenter prompts the Playwright MCP server to explore a movie-lookup web app she built and to generate a single test.
The agent starts by finding the search feature and deciding to check that it works. It picks a movie title: "star wars". Great taste.
When it searches for star wars, a completely different movie appears, with a different thumbnail and description. The presenter is under the impression that the MCP found an edge case, something she had missed, and wrote tests to uncover that issue.
That's not what appears to have happened.
The MCP server did not notice that the wrong movie was shown; we can see that from its summary and from the tests it creates. Nor did it ask the presenter to update the incorrect file, because the bad data is still there.
The point is not that the presenter did a bad job or that Playwright MCP is a load of crap.
The presenter did a great job, and Playwright is a great tool.
The point is that working with AI often leads us astray. It moves much quicker than we can think critically, and that's a problem if we want to truly assess quality.
130,000 people have watched the video, but I bet only a handful noticed that beyond the awe, the MCP created a pretty useless test.
It looks like the MCP server found an issue and acted on it, but it didn't. It was the human who noticed. Funny, that.
The MCP server then wrote a passing test rather than a failing one. A failing test would have been the better way to surface the issue and get a developer to fix it.
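To make that concrete, here is a minimal sketch of the kind of failing test that would have surfaced the bug. Everything in it is an assumption for illustration: the URL, the accessible names of the search controls, and the heading locator are hypothetical, not the demo app's real code.

```typescript
// Hypothetical Playwright test. The URL and locators below are assumptions,
// not taken from the demo app.
import { test, expect } from '@playwright/test';

test('searching for "star wars" shows the Star Wars movie', async ({ page }) => {
  await page.goto('http://localhost:3000'); // assumed local dev server

  // Assumed accessible names for the search box and button.
  await page.getByRole('textbox', { name: 'Search' }).fill('star wars');
  await page.getByRole('button', { name: 'Search' }).click();

  // This assertion encodes the INTENT (the right movie should appear),
  // so against the buggy data it fails and flags the problem, instead of
  // asserting whatever wrong movie happens to render today.
  await expect(page.getByRole('heading').first()).toHaveText(/star wars/i);
});
```

The difference is subtle but important: a test generated from observed behavior locks in the bug as "expected", while a test written from intent catches it.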
Speed is seductive. But quality requires pause.