Remember when AI models could only tell you what to do? Now, the latest LLMs can actually do things with the help of agentic AI software, and OpenAI’s new flagship model is the newest of the bunch.
GPT-5.4 is out now on ChatGPT (where it goes by the name GPT-5.4 Thinking) as well as on the OpenAI API and OpenAI’s coding tool Codex (a version of which just came out for Windows).
This new GPT arrives with a number of new and revamped tricks, starting with improved spreadsheet skills, more efficient reasoning (it can solve problems using fewer tokens, which costs you less), and the ability to show you an "upfront" plan before executing complex tasks, giving you a chance to steer the model in a new direction before it gets to work.
Most interestingly, GPT-5.4 marks OpenAI’s first general-purpose model that can actually do things on your computer, not just tell you how. For example, GPT-5.4 can click a mouse—or to be more precise, it can issue a “click the mouse” command to an AI agent system on your PC, which does the actual clicking. GPT-5.4 can also edit files on your system, type keyboard commands, and “see” screenshots (allowing it to use a web browser or interact with computer programs).
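To make the division of labor concrete, here is a rough sketch of that loop in Python: the model emits structured actions (“click here,” “type this,” “take a screenshot”), and local agent software carries them out. Everything below—the action schema, the function names, the stubbed behavior—is an illustrative assumption, not OpenAI’s actual interface; a real agent would wire these handlers to an OS automation library.

```python
def execute_action(action: dict) -> str:
    """Dispatch one model-issued action to the local machine (stubbed).

    In a real agent, each branch would call into OS automation
    (moving the cursor, sending keystrokes, capturing the screen).
    """
    kind = action.get("type")
    if kind == "click":
        return f"clicked at ({action['x']}, {action['y']})"
    if kind == "type_text":
        return f"typed {action['text']!r}"
    if kind == "screenshot":
        # The agent would capture the screen here and send the image
        # back to the model so it can "see" the current state.
        return "captured screenshot"
    return f"unsupported action: {kind}"


def agent_loop(model_actions: list[dict]) -> list[str]:
    """Execute each action the model emits, collecting results to report back."""
    return [execute_action(a) for a in model_actions]


# Example: a short action sequence a model might emit.
results = agent_loop([
    {"type": "screenshot"},
    {"type": "click", "x": 120, "y": 340},
    {"type": "type_text", "text": "quarterly report"},
])
print(results)
```

The key design point is that the model never touches the machine directly: it only produces structured requests, and the agent software decides whether and how to execute them, which is also where safety checks and user confirmations would live.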
Now, an important caveat here: GPT-5.4 can only take charge of your PC when it’s operating via the OpenAI API or OpenAI’s Codex tool. When you’re using GPT-5.4 Thinking through ChatGPT—that is, the ChatGPT desktop app or web interface—the LLM is still confined to its chatbox and its various ChatGPT integrations, such as for Google Drive, Spotify, Adobe Photoshop, and others.
It’s also worth noting that while GPT-5.4 is the first general-purpose GPT that can actually use your PC, it’s not the first GPT ever that can do so. There have been Codex-specific GPTs that can execute commands, edit files, and (to an extent) navigate graphical interfaces and weave their way through web workflows. But with its ability to actually browse the web and take charge of PC programs, GPT-5.4 takes the “computer-use” capabilities of earlier Codex-specific models to the next level.
That means you could conceivably ask a GPT-5.4-controlled AI agent on your computer to “balance my books on Quicken” and it would be able to autonomously launch the Quicken app, click its way around the interface, and balance your accounts.
Of course, whether you’d want GPT-5.4 messing around in Quicken on its own is a separate question altogether. For sensitive tasks, you’d likely want to be looking over its shoulder as it works, as you can do while coding with GPT-5.4 in the Codex app.
Still, the “do, don’t just tell” capabilities of GPT-5.4 serve as a perfect example of where we’re headed: AI agent-controlled PCs that are doing things on their own, with high-level direction from us. That said, getting our AI agents to follow our directions correctly will be the real trick.