OpenAI has unveiled Operator, a generative AI service that performs tasks on the user's behalf. Using its own browser, Operator scans webpages and interacts with them by typing, clicking, and scrolling on its own.

Specifically, it can work through a range of tedious browser-based tasks. The service can understand and use the same interfaces and tools that people work with.
In theory, this could be a game changer for businesses worldwide. OpenAI says Operator can fill out forms, place orders for the user, and even make memes.
Operator is powered by a new OpenAI model called Computer-Using Agent (CUA), which combines GPT-4o’s vision capabilities with advanced reasoning developed through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs), meaning the buttons, menus, and text fields on a user’s screen.
If the service gets stuck on a task, it hands control back to the user. Notably, users must handle sensitive input themselves, including personal information, passwords, and verification forms.
Operator will roll out gradually, starting with ChatGPT Pro subscribers in the United States. For more details, see OpenAI’s video posted on X.
