OpenAI introduced an AI agent that uses its own browser to perform repetitive tasks such as ordering groceries and making restaurant reservations, part of a move to further integrate the technology into users’ daily lives.
The company stated the agent, called Operator, “can go to the web to perform tasks for you”. It is trained to interact with the buttons, menus and text fields users typically see on the internet, enabling it to act “without requiring custom API integrations”.
Operator is trained to proactively ask users before it takes over functions that require a login, payment details, or solving a CAPTCHA screen.
Unlike an AI chatbot that asks permission to perform functions, the software works independently once users enable it.
Operator is powered by OpenAI’s Computer-Using Agent (CUA) model, which combines GPT-4o’s vision capabilities with advanced reasoning developed through reinforcement learning.
If it encounters problems or makes mistakes, OpenAI stated Operator will use its reasoning capabilities to self-correct. If those measures do not work, it will hand control back to the user.
Operator is currently in “research preview” and available only to US customers who pay for OpenAI’s ChatGPT Pro service. OpenAI stated it will learn from Operator’s early adopters how to improve the service before offering it more broadly to other paid subscribers.
It is partnering with companies such as DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to “ensure Operator addresses real-world needs while respecting established norms”.