It’s So Frustrating to Keep Hearing About RCTs Comparing Robotics to Manual Surgery


Steve Bell questions the role of RCTs in robotics vs manual surgery

I read yet another RCT today showing that robotic TKA (total knee arthroplasty) offered no benefit over manually guided surgery. So the conclusion was that robots are a waste of money. But for me that study was just another flawed RCT of “golden hands” vs “golden hands” - what did they expect?


I thought we’d stopped this a long time ago.


There’s a familiar refrain in surgical robotics: “Show me the RCT proving it’s better than manual surgery.”


On the surface, it sounds like a reasonable request. But for me, it’s increasingly the wrong question. The wrong framing.

I believe (and have for a long time) that Randomised Controlled Trials are built to answer a very specific thing: can two approaches or technologies achieve similar outcomes under controlled conditions? But that’s not the decision medicine is making anymore. The real question isn’t whether a robot can beat the very best surgeon on their very best day. It’s whether a “system” can deliver safe, consistent, reproducible outcomes across the full distribution of surgeons, hospitals, and patients.


These are two very different questions that require very different approaches.


And (importantly) can robotics remove that long tail on the bell curve - the group of surgeons making the majority of the mistakes?


And that’s where I think the RCT framing starts to break down when comparing techniques.

Stay with me. Let’s make a simple thought experiment.
Suppose we run an RCT comparing the most elite surgeons performing perfect hand-sewn anastomoses versus average surgeons using staplers. In the best hands (and the data shows this), manual suturing may match (or even outperform) the stapler. The best surgeons - that elite 0.5% - can make stunning anastomoses nearly every time.


But is anyone seriously proposing a return to a fully hand-sewn world, where all comers - surgeons of every skill level - are let loose on manual anastomoses?

Why does this feel odd to you? Because in the real world (not highly controlled RCTs) medicine doesn’t optimise for exceptional individuals. It optimises for systems that reduce variability, shorten learning curves, and deliver consistent results at scale. At scale!


We can’t keep comparing veterans with forty years of experience across two techniques. We need to look at inexperienced, low-volume, average surgeons and see which technology or technique gives them better outcomes. Which technology gets them to proficiency faster. Which technology reduces errors.

I guarantee you - middle-of-the-road surgeons and very junior surgeons are not sitting in their hospitals running RCTs on themselves to compare outcomes.


The same logic played out with laparoscopic surgery back in the 90s. I know… I was there. Early on, outcomes were not uniformly superior. There were complications - lots of complications… There was a learning curve. If you had run an RCT pitting the best open surgeons against early laparoscopic adopters, you could easily have argued against adoption. (Many tried, with references to “chopstick Nintendo”.) But most of those RCTs were simply ignored as market forces and reality drove the shape of surgery.


So laparoscopy took over. Why?

Not because it was instantly better in every metric measured in RCTs, but because it represented a platform shift to less invasive access, faster recovery, and a trajectory of improvement that fundamentally changed how surgery could be delivered. Not by the elite, but by every surgeon at every level. The system change allowed them to get better outcomes. The RCT results would have killed that progress.


(The key lesson I feel that was learned was about training...)


In my head, robotics sits in that same category. (It is happening less now, but I’ve still seen three papers in two weeks arguing robots don’t give any advantage and so are a waste of money.)

But here’s where I think it gets even more misleading, and frankly, a bit absurd.

We often see RCTs designed with “golden hands” on both sides: world-class laparoscopic surgeons versus world-class robotic surgeons. World-class manual knee surgeons vs world-class robotic knee surgeons. And then people act surprised when the outcomes look… similar. Why would they not look similar? The trial is set up to show “no difference.” They would not be world-class manual surgeons if they had bad results...


If you take the top 1% of surgeons in any modality and let them operate in ideal conditions, most modern techniques will converge. That’s what expertise does: it compresses differences. But designing a trial around that scenario answers a very niche question: how do two technologies perform when the human variable is already maxed out?
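A toy Monte Carlo sketch makes the point concrete. Everything below is invented for illustration - the skill distribution, the risk numbers, and the “floor-lifting” model of assistance are my assumptions, not clinical data - but it shows how a “golden hands” trial arm can erase a difference that is obvious across all comers.

```python
import numpy as np

# Toy model of the "golden hands" problem. All numbers are invented
# assumptions for illustration, not real clinical outcomes.
rng = np.random.default_rng(0)
n = 100_000

# Surgeon skill ~ Normal(0, 1); complication risk falls with skill.
skill = rng.normal(0.0, 1.0, n)

def complication_rate(skill, assisted):
    """Hypothetical per-surgeon complication rate."""
    base = 0.15 - 0.04 * skill            # manual risk falls with skill
    if assisted:
        # Model the robot as lifting the floor: it caps the downside
        # for weaker operators but adds little for the very best.
        base = np.minimum(base, 0.10)
    return np.clip(base, 0.01, 0.50)

manual = complication_rate(skill, assisted=False)
robotic = complication_rate(skill, assisted=True)

elite = skill > np.quantile(skill, 0.99)  # the "golden hands" arm
print(f"elite:      manual {manual[elite].mean():.3f} vs robotic {robotic[elite].mean():.3f}")
print(f"all-comers: manual {manual.mean():.3f} vs robotic {robotic.mean():.3f}")
```

In this toy world the elite arms are identical (the cap never binds for them), while across the whole distribution the assisted arm is clearly better - the trial on the elite measures nothing the technology was built to fix.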

That is not how real world healthcare works.


The real world is not populated by elite operators working in perfect environments. It is populated by variability: different skill levels, different hospitals, different levels of fatigue, different support systems. By anchoring RCTs around “golden hands vs golden hands,” we strip out the very thing these technologies are meant to address: inconsistency.

So we end up proving something technically correct and strategically useless.

The uncomfortable truth is this: if your RCT has “golden hands” on one arm (and especially if it has them on both arms) it’s already answering the wrong question. You are no longer comparing technologies or techniques in any meaningful sense. You are comparing what happens when variability has been artificially removed. You are not looking at real world populations.


If there’s one thing I’ve seen in my 35 years, it’s that medicine does not scale around the outliers.

Robotics doesn’t need to outperform the top 1% of laparoscopic surgeons. It needs to outperform the median experience: on a Tuesday afternoon, in a regional hospital, with all the messiness that real-world practice entails. It needs to reduce fatigue, enhance precision, standardise workflows, and create a platform that can be trained, repeated, and improved over time.


Don’t underestimate, either, that introducing a robot brings quality and consistency of equipment. There is massive value in that.


The technology needs to level the playing field. The training systems around it need to increase speed to proficiency. That is not something a traditional RCT is well designed to capture. It can’t capture the messiness of the real world very well.


At some point, insisting on ever more RCTs comparing robotics to manual surgery starts to look like asking whether we should revisit older methods simply because they can still perform beautifully in expert hands. It’s a backward-looking question in a forward-moving system.

I think it is worse than that: it adds a layer of confusion when non-experts read the results (not that most surgeons read the clinical journals for best practice).

Honestly, for me - the train has already left the station. The genie is too far out of the bottle and not going back in. The job now is not to use RCTs to compare robots to manual surgery. It is to understand who gets the most advantage from a robot - and then how to find the money to give those surgeons that advantage.


Because that investment will most likely bring the biggest returns.


The irony is that the less expert surgeons out in the regional centres - the ones who would get the biggest improvement from the tech - have to fight hardest for the money to get it. The big-name experts in top institutions, the ones who don’t really need it, get it! Go figure that out.

The relevant debate now isn’t whether robotics can match laparoscopy under idealised conditions with elite surgeons on both sides, or whether robotic TKA is superior to manual navigation in the hands of experts. It’s how fast it will redefine the baseline of care across everyone else, and who gets left behind waiting for a perfect trial that was never designed to answer the real question.


Because history is fairly clear on one point: Medicine doesn’t go backwards just because the past still looks good in the hands of the exceptional.


Now, I’m not saying that initial and ongoing evidence is not needed. What I’m arguing is that carefully contrived RCTs of golden hands vs golden hands will show you nothing except that an elitist group, decades into their careers, can do either technique masterfully.
I am much more for real-world data, real-world evidence. And I care most about individual-to-individual comparisons. How do I stack up against myself with Tech A vs Tech B?
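That “me vs myself” framing is essentially a paired (within-subject) comparison, and it is easy to sketch. Again, every number below is a made-up assumption for illustration - hypothetical per-surgeon complication rates and a hypothetical assistance effect - not real evidence.

```python
import numpy as np

# Paired "me vs myself" sketch: each surgeon's own outcomes with
# Tech A (manual) and Tech B (assisted). All numbers are invented
# assumptions for illustration only.
rng = np.random.default_rng(1)
n_surgeons = 200

skill = rng.normal(0.0, 1.0, n_surgeons)

# Tech A: risk falls with skill, plus per-surgeon measurement noise.
rate_a = np.clip(0.12 - 0.03 * skill + rng.normal(0.0, 0.01, n_surgeons), 0.0, 1.0)
# Tech B: modelled as capping each surgeon's own risk (floor-lifting).
rate_b = np.clip(np.minimum(rate_a, 0.09) + rng.normal(0.0, 0.01, n_surgeons), 0.0, 1.0)

# The within-surgeon delta: each operator against their own baseline.
delta = rate_a - rate_b
print(f"mean within-surgeon improvement: {delta.mean():.3f}")
print(f"share of surgeons who improved:  {(delta > 0).mean():.0%}")
```

The point of the paired design is that each surgeon is their own control, so the question becomes “did the tech help *me*?” rather than “can the tech beat someone else’s golden hands?”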




It is pointless comparing my driving to a Formula 1 driver’s. What does that even show? Instead I want to know if I drive better, safer, and with less fatigue with driver assist vs without. How do I feel? How is my driving measured? Am I more fuel efficient? Do I have fewer near misses? Fewer crashes? With the technology vs without the technology.


And this is the big one… should we be having learner drivers look at studies of Formula 1 drivers with and without cruise control and saying that is representative of them? Especially if it shows that F1 drivers on the road get absolutely no benefit from driver assist? F1 drivers get the same outcomes whether they have driver-assist features or not. So, QED, driver assist is a waste of money.




Make sense?


Instead, I personally want to see what happens if you give a junior surgeon, or a poorly skilled surgeon, or a low-volume surgeon a robot - with all the training tools, simulators, imaging, 3D, wrists, safety-assist features, assistive features, consistency of kit, etc.
Do they perform better than if they use 2D straight-stick laparoscopy? Isn’t that the comparison that is relevant? Isn’t that the problem we are actually trying to solve for? How do the current RCTs tackle that? (I know there are a limited number in this direction, but they don’t make the headlines.)


I am not convinced everyone has fully understood the power (and the speed of feedback) of things like da Vinci insights, or Versius Connect to get to these answers. Being able to see individual performance and progress in a real world environment is surely what we need to look at - not RCTs of experts? Or am I just missing something? 


And of course they need to link to outcomes and EHRs - and all that. We have to be able to measure real outcomes. But don’t we want to measure technology by its impact on the lowest common denominator - not the highest? For me it’s a bit like having my own Strava and seeing my own performance and gains, rather than reading articles about the top cyclists in the world getting a 0.5% gain from this aero bike vs that one and concluding it’s hardly worth it. I care about my performance and whether that tech makes me faster. And how that tech makes me 40% faster on a climb. (Bad example?)


I do agree with many on registries, and think registries give a better view into the real world than RCTs on certain metrics. They give early insight into safety as a whole - and I think they can offer more insights if you have registries of lap vs robotic vs open. That data for “all comers” can give interesting insights.

But I’m still convinced that looking at the impact at an individual level - at the average user - is where we will see if any technology is bringing real value. Surgeon, department, hospital level. Their own ground truth.


Anyhow, in 2014 people were asking "Why a robot?" Today, in 2026, they are asking "Which robot do I choose... and how do we pay for it?"

Maybe those RCTs saying robots are a waste of time and money have just become less relevant in real world decision making?


These are just musings and thoughts for education purposes only.
