This is very funny, because Matt Kaeberlein in his recent podcast made various claims about ChatGPT in the context of interpreting your blood tests, biomarkers, and diagnostic results. He said that until recently it wasn’t great, but these days it’s “very good”. I thought about this quite a bit, reflected on my recent experiences, and I was not impressed.

The caveat is that I was using Gemini and not the paid version of ChatGPT 5, so maybe things are different at more rarefied levels of “reasoning”, but Gemini at least struck me as exceptionally poor. Despite my specifically prompting it to use PubMed studies in its analysis, it frequently missed pivotal studies that directly addressed the question. Mostly my questions were pretty simple, like likely interactions between certain drugs (telmisartan, empagliflozin, pitavastatin, pioglitazone), and it was really terrible. I know that because I have been researching these interactions for months, so I had the background to know where the AI went wrong, all despite a lot of prompt modification and guiding.

I came away from the experience profoundly disappointed and deeply skeptical about the utility and reliability of AI in healthcare, at least at present (summer-fall of 2025). No doubt there are areas where it’s extremely good, such as image interpretation, finding things in x-rays and the like, but when it comes to interpreting what is happening in the human body at a biochemical level, it’s completely hopeless. Perhaps it’ll get better one day, but for now I’ve got to disagree with MK: it’s not “very good”, it’s “very, VERY bad”. I wouldn’t dream of relying on AI for medical decisions atm. YMMV.
1 Like
I can put your questions into GPT5 if you wish.
I might take you up on that, stay tuned! Thank you, and much obliged.
If you want to do it offline and only share your conclusions, we can do this via email and/or a Zoom call.
1 Like
A_User
#186
For programming, according to George Hotz:
It’s not precise in specifying things. The only reason it works for many common programming workflows is because they are common. The minute you try to do new things, you need to be as verbose as the underlying language.
I guess, applied to healthcare, you need to be so verbose in your instructions and prompts that you have almost done the work for it yourself; otherwise you get a common list with a common answer, at least as things stand now. So right now it’s more an amplifier of one’s own capabilities than anything else.
The PubMed studies issue might be a search-tool issue: I think it literally outputs something like “search(pioglitazone pubmed)”, then uses whatever search results come back to produce the output you see.
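For illustration, here is a minimal sketch of how that kind of search tool-call loop is often wired up. Everything here is hypothetical: `call_model` and `run_search` are stand-ins, not any vendor’s actual API. The point is just that the model emits a search request as text, the host runs the search, and the results get folded back into the prompt, so the quality of the answer is capped by what the search step returns.

```python
import re

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for the LLM call; a real assistant decides on
    # its own whether to emit a tool call like search(...).
    if "SEARCH RESULTS" not in prompt:
        return "search(pioglitazone telmisartan interaction pubmed)"
    return "Summary based only on the search results shown above."

def run_search(query: str) -> list[str]:
    # Hypothetical search backend; in practice this hits a web or PubMed
    # index, and its recall limits how good the final answer can be.
    return [f"[stub result for: {query}]"]

def answer(question: str) -> str:
    reply = call_model(question)
    match = re.fullmatch(r"search\((.+)\)", reply.strip())
    if match:
        # The model asked for a search: run it, fold the results back into
        # the prompt, and ask the model again.
        results = run_search(match.group(1))
        augmented = f"{question}\n\nSEARCH RESULTS:\n" + "\n".join(results)
        reply = call_model(augmented)
    return reply

print(answer("Likely interactions between pioglitazone and telmisartan?"))
```

If the search step never surfaces the pivotal study, the model has nothing to summarize from, no matter how good the model itself is.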
2 Likes
mccoy
#187
Just today I used Grok, the free version anchored to an X account, and it did search the PubMed database, outlining a recent (2024) article, a meta-analysis on protein requirements.
The same prompt fed to GPT5 resulted in an answer without a PubMed citation and more anchored to the classic guidelines. The answers were pretty different: GPT5 was more balanced and traditional, whereas Grok paid more attention to current web opinions while also examining the literature, with a more ‘rebellious’ nuance, to cite Musk himself.
To be fair, GPT would have required a different prompt, since it is said to be very sensitive to the input, so specific prompt-engineering rules should be applied, as per the OpenAI ‘cookbook’.
1 Like
You all need to start using ChatGPT 5 (I have Pro) with the “thinking” mode. Mine gives me citations and generally does a great job. However, it does still have the tendency to try to please me by telling me what I want to hear, and to relate everything to my specific situation. For example, it knows I lift weights etc., so that would influence an answer about protein intake. So in the prompt I have to tell it to just give objective answers.
If people are just using AI models like Google and typing very short prompts, they’re going to have a miserable time.
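For what it’s worth, here is a minimal sketch of the kind of standing “objective answers” preamble I mean, assuming you prepend it (or a saved-preferences equivalent) to each question. The wording and the helper function are purely illustrative, not a tested or official prompt.

```python
# Hypothetical standing preamble of the kind described above; the wording is
# illustrative, adjust it to your own situation.
OBJECTIVE_PREAMBLE = (
    "Answer objectively. Do not tailor the answer to my personal situation "
    "unless I explicitly ask. Cite primary sources (PubMed IDs or DOIs) for "
    "every quantitative claim, and say 'not established' when the evidence "
    "is thin rather than guessing."
)

def build_prompt(question: str) -> str:
    # Prepend the preamble so every question carries the same ground rules.
    return f"{OBJECTIVE_PREAMBLE}\n\nQuestion: {question}"

print(build_prompt("What daily protein intake does the evidence support?"))
```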
2 Likes
mccoy
#189
Gemini 2.5 Flash (free) is pretty good, and it has capabilities like ‘tools’ where you write your preferences and these act as an overall master prompt. But sometimes GPT5, when correctly prompted and in deep-thinking mode, is formidable. Not without some annoyances like the ones you described.
Every AI has its nuances, and it’s probably not a bad idea to consult 2 or 3 of ’em on the same issue, if it’s considered important enough.
1 Like
RapAdmin
#190
Full issue: Stanford Medicine magazine reports on chronic disease prevention, diagnostics, care
Paging Dr. Algorithm: Stanford University’s medical school is revamping its curriculum to incorporate lessons on how AI works and how to use it. It’s also providing AI-based apps so students can practice interacting with patients and making diagnoses.
1 Like
There are disputes and arguments, and you can argue endlessly. Fortunately, there’s a much simpler resolution: let nature take its course, and in due time it will be revealed who was right, the optimists or the pessimists. It shouldn’t take long according to the optimists, so there’s at least that: we won’t have to wait much. I’m getting my popcorn ready.
3 Likes
I generally think the future is very bright in terms of applying GPT-5+ type models to healthcare and life extension, but every once in a while I read something from various AI models that makes me angry. For example, they sometimes misread things, confusing mortality rates for treated versus untreated cases. This is why it’s good to go to sources like:
If untreated, a brain abscess is almost always deadly. With treatment, the death rate is about 10% to 30%. The earlier treatment is received, the better.
Some people may have long-term brain or nerve damage after a brain abscess or surgery.
1 Like
mccoy
#194
The latest serious mistakes I saw were when asking for opinions on a PDF with my recent blood analysis. The AIs (I consulted GPT5 and Gemini) read one or two values very incorrectly. Then they reported some values that weren’t there at all. I’ll point out that the print was very clear.
A healthy dose of skepticism must always be applied, and cross-checking the figures is necessary. How accurately they read attachments, especially details like numerical values, is not predictable a priori. Sometimes it’s excellent, sometimes much less so.
2 Likes
I don’t know. What confidence can you have in anything these AI platforms come up with when they make such fundamental errors? These are just the things you caught; what about all the stuff you didn’t? Do you really think their analyses are worth a damn? I don’t.
1 Like
mccoy
#196
I understand your skepticism, but in serious matters, some cross-checking with sources is very advisable. Also, the answers sometimes include very interesting novel aspects and details that we didn’t know.
However, what I found to be the foremost principle is that you should already have some knowledge of the topic being discussed. If you know nothing, it will be very hard to judge the reliability of the answers. If you know the issue, then it becomes a real brainstorming session.
AI might be our best hope to fix health care
Health care remains one of the most stubborn failures of American society. Costs keep climbing at unsustainable rates. More than 27 million people remain uninsured and more than 100 million lack a primary care provider. While some are fortunate to receive state-of-the-art care, as many as 200,000 patients die each year from preventable medical errors.
Smart people have been grasping for ways to fix these problems for generations. They’ve tinkered with payment models and tried desperately to expand the industry’s workforce. Nothing has come close to solving the industry’s deficiencies.
Now, however, the country has a new reason for hope: artificial intelligence. That’s the big idea in health informaticist Charlotte Blease’s new book, “Dr. Bot: Why Doctors Can Fail Us — and How AI Can Save Lives.”
Read the full opinion article: AI might be our best hope to fix health care (WaPo)
Related:
https://www.amazon.co.uk/Dr-Bot-Doctors-Us_and-Could/dp/0300247141
Table of Contents: https://www.jstor.org/stable/jj.33193139
https://www.hifa.org/dgroups-rss/dr-bot-why-doctors-can-fail-us-and-how-ai-could-save-lives-2
The A.I. Will See You Now: Why Your Doctor’s Days Are Numbered
2 Likes
Why Doctors Say OpenEvidence Is A ‘Game Changer’ (Bloomberg Television)
4 Likes
Kurzgesagt goes into detail about how AI falsifications are becoming facts, which is dangerous. In essence, an AI will make up something like 20% of the information it presents. This then gets used by a journalist or YouTuber, and suddenly that made-up information becomes a legitimate source. It is then picked up and reinforced by other AIs, further codifying and legitimizing the falsifications.
3 Likes
Yes, this is definitely a problem. I’m an Associate Editor at a couple of (decent, but not S-tier) journals. We are inundated with papers that are clearly written by LLMs. Of course some will be accepted, either by our journals or by others, and go on to be the “truth” used to inform other models and other people.
There is also a big problem in science with dogma taking over. For example, people got very hyped over stem cell therapy, and it has gone absolutely nowhere. People are very hyped over nanomedicine, and it’s gone nowhere. The same is now true for exosomes: the same claims repeated again and again, backed by almost no evidence. But those things get written into papers, which are then digested by AI models and regurgitated to people who then see a claim with supporting published evidence.
I’ve also seen this directly in my own work. My suggestion is that you really interrogate your AI model of choice on a topic where you are extremely knowledgeable. Then you see the gaps. They also really want to please you, so they’ll tell you what you want to hear. They might even kiss your ass by complimenting your insightful question, with “you’re absolutely right to be sceptical”, etc.
2 Likes