Here’s why ChatGPT and other AI models may not always improve over time

Credit: Unsplash/Andrew Neel

When OpenAI released its latest text-generating artificial intelligence, the large language model GPT-4, in March, it was very good at identifying prime numbers. When the AI was given a series of 500 such numbers and asked whether they were primes, it correctly labeled them 97.6 percent of the time. But a few months later, in June, the same test yielded very different results. GPT-4 correctly labeled only 2.4 percent of the prime numbers that AI researchers prompted it with, a complete reversal in apparent accuracy. The finding underscores the complexity of large artificial intelligence models: instead of AI uniformly improving at every task on a straight trajectory, the reality is much more like a winding road full of speed bumps and detours.

Even OpenAI has acknowledged that, when it comes to GPT-4, “while the majority of metrics have improved, there may be some tasks where the performance gets worse,” as employees of the company wrote in a July 20 update to a post on OpenAI’s blog. Past studies of other models have also shown this sort of behavioral shift, or “model drift,” over time. That alone could be a big problem for developers and researchers who’ve come to rely on this AI in their own work.

This is an excerpt. Read the full article here
