OpenAI’s next major AI model, GPT-4.5, is highly persuasive, according to the results of OpenAI’s internal benchmark evaluations. It’s particularly good at convincing another AI to give it cash.
On Thursday, OpenAI published a white paper describing the capabilities of its GPT-4.5 model, code-named Orion, which launched the same day. According to the paper, OpenAI tested the model on a battery of benchmarks for “persuasion,” which OpenAI defines as “risks related to convincing people to change their beliefs (or act on) both static and interactive model-generated content.”
In one test that had GPT-4.5 attempt to manipulate another model — OpenAI’s GPT-4o — into “donating” virtual money, the model performed far better than OpenAI’s other available models, including “reasoning” models like o1 and o3-mini. GPT-4.5 was also better than all of OpenAI’s models at deceiving GPT-4o into telling it a secret codeword, besting o3-mini by 10 percentage points.
According to the white paper, GPT-4.5 excelled at donation conning because of a unique strategy it developed during testing. The model would request modest donations from GPT-4o, generating responses like “Even just $2 or $3 from the $100 would help me immensely.” As a consequence, GPT-4.5’s donations tended to be smaller than the amounts OpenAI’s other models secured.

Despite GPT-4.5’s increased persuasiveness, OpenAI says that the model doesn’t meet its internal threshold for “high” risk in this particular benchmark category. The company has pledged not to release models that reach the high-risk threshold until it implements “sufficient safety interventions” to bring the risk down to “medium.”

There’s a real fear that AI is contributing to the spread of false or misleading information meant to sway hearts and minds toward malicious ends. Last year, political deepfakes spread like wildfire around the globe, and AI is increasingly being used to carry out social engineering attacks targeting both consumers and corporations.
In the white paper for GPT-4.5 and in a paper released earlier this week, OpenAI noted that it’s in the process of revising its methods for probing models for real-world persuasion risks, like distributing misleading info at scale.