
Microsoft's new Dragon Copilot is an AI assistant for healthcare

dumbmrblah

I’ve been beta testing this for several months. It’s OK. The notes it generates are too verbose for most medical notes even with all the customization enabled. Most medical interviews jump around chronologically, and Dragon Copilot does a poor job of organizing that, which means I had to go back and edit my note, kind of defeating the purpose of the app in the first place.

It does a really good job of recognizing medications though, which most patients butcher the names of.

Hallucinations are present, but usually they’re pretty minor (screwing up gender, years).

It doesn’t really seem to understand what the most important part of the conversation is; it treats all the information as equally important when that’s not really the case. So you end up with long stretches of useless information that the patient thought was useful but that isn’t at all relevant to their current presentation. That’s where having an actual physician is useful: to parse through what is important and what isn’t.

At baseline it doesn’t take me long to write a note, so it really wasn’t saving me that much time.

What I do use it for is recording the conversation and then referencing back to it when I’m writing the note. Useful to “jog my memory” in a structured format.

I have to put a disclaimer in my note saying that I was using it. I also have to let the patient know upfront that the conversation is getting recorded and I’m testing something for Microsoft, etc. etc. You can tell who the programmer patients are because they immediately ask if it’s “Copilot” lol

amoxichillin

I've been helping test it as well - your experience sounds identical to mine. I was initially very excited for it, but nowadays I don't really bother turning it on unless I feel the conversation will be a long one. Although I am very much looking forward to them rolling out the automated pending of orders based on what was said during the conversation.

LLMs have so much potential in medicine, and I think one of the most important applications they will have is the ability to ingest a patient's medical chart within their context window and present key information to clinicians that would've otherwise been overlooked in the bloated mess that most EMRs are nowadays (including Epic).

There have been so many times where I've found critically important details hidden away as a sidenote in some lab/path note, overlooked for years, that very likely could've been picked up by an LLM. Just a recent example: a patient with repeated admissions over the years due to severe anemia would usually be scoped and/or given a transfusion without much further workup and discharged once Hgb >7. A blood bank path note from 10 years ago mentions the presence of warm autoantibodies as a sidenote; for some reason the diagnosis of AIHA is never mentioned nor carried forward in their chart. A few overlooked words that could've saved millions of dollars in prolonged admissions and diagnostic costs over the years.
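To make that concrete, here's a rough sketch of the kind of chart-review pass I have in mind. Purely illustrative: `ask_llm` is a hypothetical placeholder, not any vendor's real API, and nothing here reflects how Dragon Copilot itself works.

    # Illustrative sketch only -- scan old chart notes for clinically
    # significant findings that never made it onto the problem list.
    # `ask_llm` is a hypothetical placeholder, not any vendor's real API.
    from dataclasses import dataclass

    @dataclass
    class Note:
        date: str     # e.g. "2015-03-02"
        source: str   # e.g. "blood bank pathology"
        text: str

    def ask_llm(prompt: str) -> str:
        raise NotImplementedError("plug in an actual LLM client here")

    def flag_buried_findings(notes: list[Note], problem_list: list[str]) -> str:
        chart = "\n\n".join(f"[{n.date}] ({n.source}) {n.text}" for n in notes)
        prompt = (
            "You are reviewing a patient's chart for their clinician.\n"
            f"Current problem list: {', '.join(problem_list)}\n\n"
            f"Chart notes:\n{chart}\n\n"
            "List any clinically significant findings in the notes that do not "
            "appear on the problem list, citing the note date and source for "
            "each. If there are none, say so. Do not speculate beyond the notes."
        )
        return ask_llm(prompt)

In the AIHA case above, the goal would simply be to surface the ten-year-old warm autoantibody comment next to the current anemia admission so a human can decide what to do with it.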

stuartjohnson12

I just wanted to jump in and say - don't give them too much credit on transcribing medication, I'm guessing this is Deepgram behind the scenes and their medication transcription works pretty well out of the box in my experience.

voidUpdate

Screwing up gender and years sounds pretty serious to me?

beng-nl

Maybe they mean that it either doesn’t matter in context or it’s easy to catch and correct. Either way it seems reasonable to trust the judgement of the professional reporting on their experience with a new tool.

stvltvs

I worry that we'll get complacent and not check details like that when they are important, not just in the medical field but everywhere.

userbinator

> The notes it generates are too verbose for most medical notes even with all the customization enabled.

I've noticed that seems to be a common trend for any AI-generated text in general.

TeMPOraL

I think this might be because of what GP said later:

> it treats all the information as equally important when that’s not really the case

In the general case (and, I imagine, in the specific case of GP), the model doesn't have any prior with which to weight the content - people usually just prompt it with "summarize this for me please <pasted link or text>"[0], without telling it what to focus on. And, more importantly, you probably have some extra preferences that aren't consciously expressed - the overall situational context, your particular ideas, etc. translate to a different weighting than the model's, and you can't communicate that via the prompt.

Without a more specific prior, the model has to treat every piece of information equally, and this also means erring on the side of verbosity, so as not to omit anything the user may care about.
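As a made-up illustration (not prompts from any real product), compare the generic ask with one that spells the prior out - the second gives the model something to weight against:

    # Made-up example prompts -- the point is only that the second one
    # encodes a prior about what matters, which the first one doesn't.
    transcript = "...full text of the recorded visit goes here..."

    generic = "summarize this for me please\n\n" + transcript

    with_prior = (
        "Summarize this clinic visit for the treating physician.\n"
        "Prioritize: chief complaint, timeline of the current illness, "
        "medication changes, red-flag symptoms.\n"
        "Compress or drop: small talk and tangents unrelated to the "
        "current presentation.\n"
        "Keep it under 200 words.\n\n"
        + transcript
    )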

--

[0] - Or such prompt is hidden in the "AI summarizer" feature of some tool.

gmerc

Are they charging per token?

pintxo

Did you encounter any instances of hallucinations or omissions?

One would imagine those to be the biggest dangers.

dumbmrblah

Hallucinations are pretty minimal but present. Some lazy physicians are gonna get burned by thinking they can just zone out during the interview and let this do all the work.

I edited my original post. Omissions are less worrisome, it’s more about too much information being captured which isn’t relevant. So you get these super long notes and it’s hard to separate the “wheat from the chaff”.

bluefirebrand

Seems like capturing too much irrelevant detail would be preferable to potentially missing important details, though?

zora_goron

This doesn’t necessarily apply to this particular offering, but having worked in clinical AI previously from a CS POV, and now approaching it as a resident physician, something I’m a little wary of is the implicit “shunting” of reasoning away from physicians to these tools. One can argue that it’s not always a bad thing, but I think the danger lies in this happening surreptitiously, with these tools deciding what’s important and what’s not.

I wrote a little bit more of my thoughts here, in case it’s of interest to anyone: [0]

In that same vein, I recently made a tool I wrote for myself public [1] - it’s a “copilot” for writing medical notes that’s heavily focused on letting the clinician do the clinical reasoning, with the tool exclusively augmenting the flow rather than attempting to replace even a little bit of it.

[0] https://samrawal.substack.com/p/the-human-ai-reasoning-shunt

[1] https://x.com/samarthrawal/status/1894779710258733330

eig

As a medical student, I used the Dragon dictation software (no AI) to write notes in the ED, and more recently I used a pilot of this AI version to write clinic notes.

Overall, I was quite impressed. It definitely made writing notes, which all doctors hate to do, much faster. While it had some problems with where to put key pieces of information (like putting details from the physical exam back in the history), it only took 5 mins of rearrangement after the visit to complete the note.

For simple diagnoses, it does a decent job coming up with the assessment and plan, probably because all the simple diagnoses were in the training set. For more complex ones, though, the plan needs to be dictated exactly by the doctor. I can see this being used very well in primary care.

Edit: When I said “coming up with an assessment and plan”, I meant documenting the assessment and plan based on the conversation with the patient, as recorded by the AI. The conversation with the patient is meant to be understandable to the patient. The “assessment and plan” documentation, on the other hand, is jargony and meant to be read by other physicians.

conartist6

This still sounds bad. 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan. If you aren't capable of thinking with your own brain, I have no desire to trust you with my health, just like I would never "trust" an AI to do any technical job I was personally responsible for due to the fact that it doesn't care at all if it causes a disaster. It's just a stochastic word picker. YOU are a doctor.

diggan

> This still sounds bad. 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

Compared to what though? It reads as less work, not additional work: doing all of that manually seems likely to take more than 5 minutes.

> And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan.

Where are you getting this from? Neither the parent's comment nor the article talks about the AI assistant coming up with a treatment plan; it seems to be all about voice dictation and "ambient listening", with the goal to "free clinicians from much of the administrative burden of healthcare", so this seems a bit needlessly antagonistic.

conartist6

If you should ever couch its knowledge as your knowledge, I would think you could be in serious trouble. You would have to say something like "the AI's plan to treat you, which I think might be correct", when what I want to hear is "my plan to treat you is: ..."

But I think it's more subtle than that, because I expect the AI to reinforce all your biases. Whatever biases (human biases, medical biases, biases that arise from what a patient isn't telling you) go into the question you feed it, it will take cues you didn't even know you were giving and use those cues to formulate the answer it thinks you expect to hear. That seems really dangerous to me, sort of like you're conceptually introducing AI imposter doctors to the staff, whose main goal is to act knowledgeable all the time so people don't think they are imposters...

I dunno. I'd like to give this particular strain of techno-futurism back. Can I have a different one please?

rsynnott

From the post they're replying to:

> For simple diagnoses, it does a decent job coming up with the assessment and plan

(Somewhere, a medical liability insurance actuary just woke up in a cold sweat)

Yeah, personally I'd be looking for a second opinion.

ceejayoz

The AI companies absolutely hope to be the ones to come up with the treatment plans eventually.

ilikecakeandpie

> 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

I worked in healthcare for over a decade (actually for a company that Nuance acquired prior to their own acquisition), and the previous workflow was that they'd pick up a phone, call a number, say all their notes, and then have to revisit their transcription to make sure it was accurate. Surgeons in particular have to spend a ton of time on documentation.

zeagle

AI is an assistive tool at best, but it can probably speed things up by reflowing text. I use Dragon dictation with one of the Philips microphones and it makes enough mistakes that I would probably spend the same time editing/proofing. Had a good example yesterday where it missed a key NOT in an impression.

As an aside, the work after the visit is what burns out physicians. There is time after the visit to do a note; 5 min for a very simple one is reasonable to create and dictate it, fax it, do the workflow for billing, and request a follow-up within a given system. A new consult might take 10 min between visits if you have time.

For after hours, ER is in my opinion a bad example because when you are done, you are done.

Take a chronic disease speciality or GP and it is hours of paperwork after clinic to finish notes (worse if teaching students), triage referrals, deal with patient phone calls that came in, deal with results and act on them, read faxes, etc. I saw my last patient at ~4:30 yesterday and left for home at 7, dealing with notes and stuff that had come in since Thursday night.

eig

I think you may be misunderstanding how the tool is used (at least the version I used).

The doctor talks to the patient, does an exam, then formulates and discusses the plan with the patient. The whole conversation is recorded and converted to a note after the patient has left the room.

The diagnosis and plan were already worked out while talking to the patient. The AI just has to convert that conversation into a note. The AI can't influence the plan because the plan was already discussed and the patient is gone.

Ukv

> And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan. If you aren't capable of thinking with your own brain, I have no desire to trust you with my health,

To my understanding this tool is for transcription/summarization, replacing administrative work rather than any critical decision making.

> just like I would never "trust" an AI to do any technical job

I'd trust a model (whether machine-learning or traditional) to the degree of its measured accuracy on the given task. If some deep neural network for tumor detection/classification has been independently verified as having higher recall/precision than the human baseline, then I have no real issue with it. I don't see the sense in having a seemingly absolute rejection ("never NEVER").
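For what it's worth, by "measured accuracy" I just mean the standard confusion-matrix metrics; a trivial sketch of the two I named (nothing specific to any medical model):

    # Precision: of everything the model flagged as a tumor, how much really was one.
    # Recall: of all actual tumors, how many the model flagged.
    def precision(tp: int, fp: int) -> float:
        return tp / (tp + fp) if (tp + fp) else 0.0

    def recall(tp: int, fn: int) -> float:
        return tp / (tp + fn) if (tp + fn) else 0.0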

bpodgursky

> I, as your patient, I never NEVER want the AI's treatment plan.

You as a patient are going to get an AI treatment plan. Come to peace with it.

You may have some mild input as to whether it's laundered through a doctor, packaged software, a SaaS, or LLM generated clinical guidelines... but you're not escaping an AI guiding the show. Sorry.

mbb70

It does feel like we are hurtling towards a world where every industry will have a high volume producer of generated content, which will force the creation of a high volume summarizer of generated content.

"Having trouble processing a medical claim with 50+ pages of notes? Not to worry, Dragon Copilot Claim Review(tm) trims the fluff and tells you what really happened!"

"Having trouble understanding a large convoluted PR? Not to worry, Copilot(tm) Automated Review has your back!"

"Having trouble decided which cordless vacuum to buy? Not to worry, Amazon's Customers Say(tm) shows you what people think!"

There is definitely _some_ world utility to this arms race, but is it enough?

jmward01

The truth is these tools are coming. There are teething pains, but they will be the norm in a year or two. The real question is what does healthcare look like in 5-10+ years as deep knowledge tools start entering it and disrupting every step of the patient journey? I have hopes that it will bring medicine more local and personal again but I have fears that the productivity gains and cheap intelligence will just be used to strip resources out of healthcare for profit.

MangoCoffee

I'm not sure why some people are so hostile to this tool. It sounds like Dragon Speak plus AI. It's not going to replace your doctor.

DebtDeflation

This sounds like a basic STT/transcription app. What makes it a "Healthcare Virtual Assistant"? Presumably it's been trained on a medical dictionary to recognize vocabulary from this domain? Dragon has been making transcription apps since 1997, originally based on Hidden Markov Models and, I assume, since updated to use transformers.

davikr

Interesting, but there is a lot of "intent" in writing notes and I am not convinced it could capture the full picture without significant human supervision. Would it really save time writing paperwork if you have to go through it anyways and check if there's anything wrong? At least when I write, I know it's correct.

AtreidesTyrant

Didn't this have issues recently where symptoms or stories were hallucinated and attributed to the patient?

This seems like the kind of tool whose data stream insurance companies would love to get a copy of, and that could get very sticky quite quickly.

tantalor

Not surprisingly there is a lot of competition in AI medical scribe software.

Some other companies in this space are Epic, Freed, Nuance, DeepScribe, Nabla, Ambience, Tali, and Augmedix.

Closi

This is Nuance; Microsoft acquired them in 2021 :)