AI can use your computer now. Should it?
www.vox.com
The first time I heard about AI agents, I thought they could monitor your computer use, anticipate your needs, and manipulate your behavior accordingly. This wasn't entirely off base. There is a dystopic future about what AI technology could enable that experts issue regular warnings about. There's also the present reality of agentic AI, which is here and clumsier than you would have guessed.

Last month, OpenAI released something called Operator. It's what experts would call an AI agent, meaning a version of AI technology that can not only recall information and generate content, like ChatGPT, but can also actually do things. In the case of Operator, the AI can use a web browser to do anything from buying your groceries to updating your LinkedIn profile. At least in theory. Operator is also currently a research preview that's only available to ChatGPT Pro users, who pay $200 a month for the privilege.

The reality is that, in its current form, Operator is not great at doing things. I've spent a week using it and, if I'm being honest, am happy to report that Operator is slow, makes mistakes, and constantly asks for help. Far from the frightening digital Übermensch I once feared, what appears to be the state of the art for a consumer-grade AI agent is impressive yet unintimidating. If you ask it to find you a road bike in your size that's on sale and nearby, it can do it. Give it the right amount of context and constraints, and Operator truly works. But if I put in the time myself, I could still find a better bike.

"I'm very optimistic about using AI as sort of a dumb assistant, in that I don't want it to make decisions for me," said Aditi Raghunathan, an assistant professor of computer science at Carnegie Mellon University. "I don't trust it to do things better than me."

The basic concept of an AI agent is simultaneously alluring and horrific. Who wouldn't want an AI to handle mundane computer chores?
But if the AI can use a computer to do boring things, you have to imagine it can do scary things, too. For now, for people like you and me, scary things include buying expensive eggs or briefly screwing up your presence on the world's largest network for professionals. For the economy as a whole, well, it depends on how much we trust AI and how much freedom we give it to operate unchecked.

Global leaders gathered for the Paris AI Action Summit this week to discuss the future of the technology. Past summits in Bletchley Park, famous for its code-breaking computer used in World War II, and Seoul focused on AI safety, including the kinds of regulations governments should adopt in order to keep AI in check. But this meeting seemed to highlight a growing sense of competition between global powers, namely the US and China, to win the AI arms race. JD Vance was in attendance and said, "The AI future is not going to be won by hand-wringing about safety."

So now I'm feeling a little nervous. While OpenAI's entry into the AI agent space currently feels like a parlor trick, I have to wonder what the industry's endgame is here. AI could usher in a friendly future of digital assistants who make our lives easier without any negative consequences. Or it could finally realize the paperclip scenario, in which we give AI free rein to solve one problem, like making paperclips, and it diverts all global resources toward that problem, destroying humanity in the process. The future will almost certainly be something in between the best- and worst-case scenarios. In any case, plenty of experts say fully autonomous agents should never be invented. I have to say, if the AI agents of the future are as clumsy as Operator is right now, I'm not too worried.

AI agents for the rest of us

Whether you like it or not, the next wave of AI technology will involve computers using computers. It's already happening.
In the big agriculture industry, for example, farmers are already handing over the keys to their John Deere tractors to AI-powered software that can work through the night. Others, like the global development nonprofit Digital Green, are giving farmers in developing countries access to Operator so that it can lower costs and improve crop yields.

"A farmer can take a picture of a crop, and they can determine the crop is not doing well because of a bug, or it can check the weather to see if it's weather-related," said Kevin Barenblat, co-founder and president of Fast Forward, a tech nonprofit accelerator that supports Digital Green. "Giving the agent more flexibility to figure out what the problem is is really helpful for people when they're trying to solve problems."

Another arresting example of AI agents in action is also a pretty boring one, which tells you something about how this technology can be most useful. Rekki, a startup in London, recently told Bloomberg that it sells access to AI agents that are trained to help restaurants and their suppliers streamline inventory management. A restaurant, for instance, could give the chatbot a long list of ingredients it uses and make sure everything is ordered on time. It works well enough that some companies are cutting staff and paying for the software instead.

Enter AI-curious consumers, like me, with problems to solve. If you pay the $200 a month, you gain access to a user-friendly version of Operator that looks and acts a lot like ChatGPT. While it currently works as a separate app on ChatGPT's website, OpenAI ultimately plans to integrate Operator into ChatGPT for a seamless experience. Interacting with Operator is already a lot like using ChatGPT: You get Operator to do tasks by typing prompts into a familiar-looking empty box. Then things get interesting. Operator opens up a tiny browser window and starts doing the task.
You can watch it try and fail in real time.

A couple of things Operator successfully did for me: It bought me a new vacuum, and it initiated an exchange for a mattress I bought online. In both cases, however, I essentially did the heavy lifting. Operator can't currently log into websites on your behalf, solve CAPTCHAs, or enter credit card information. So when I was purchasing the vacuum, Operator got as far as finding the product listing, but I pretty much did everything after that. In the customer service example, Operator found the right form, but I filled it out, and then the whole transaction moved over to email, where Operator had no jurisdiction.

These seemingly innocuous tasks are exactly the kind of thing that OpenAI wants Operator to do right now. It actually serves up suggestions under that prompt box for things like making restaurant reservations, booking plane tickets, and ordering an Uber. If you consider that you're not actually handing over your credit card to the AI, getting Operator to do your shopping sounds like a good idea. It will compare prices for you, and that part requires little supervision. In one instance, Operator even flagged a potentially fraudulent website selling a Dyson vacuum for $50. But you can also imagine a future in which fraudsters know the AI's weaknesses and exploit them.

In its current form, Operator amounts to a painfully slow way to use Google, or rather Bing, thanks to OpenAI's partnership with Microsoft. It can do tasks for you while you're doing something else, but like ChatGPT before it, you always have to check Operator's work. I asked it to find me the cheapest flights for a weekend visit to my mom's house in Tennessee, and it returned a two-week-long itinerary that cost double what I'd expect to pay. When I explained the error, Operator did it again, but worse.

Operator is, in many ways, a mirage.
It looks like a proof of concept that AI can not just generate text and images but actually perform tasks autonomously, making your life effortless in the process. But the more you ask the agent to do, the more agency it requires. This is a big conundrum for the future of AI development. When you put guardrails on tools, not letting Operator go wild with your credit card, for instance, you constrain their utility. If you give the agent more power to make decisions and operate independently, it may be more useful but also more dangerous.

Which brings us back to the paperclip problem. First popularized by philosopher Nick Bostrom in 2003, the paperclip scenario imagines giving a superintelligent AI the task of manufacturing paperclips, and the freedom to do so unchecked. It doesn't end well for humans, which is a stark reminder that responsible AI development is not just about preventing an AI from using your credit card without permission. The stakes are much higher.

"One of the most high-risk scenarios would be AI agents deployed to accelerate biological weapons development," said Sarah Kreps, director of the Tech Policy Institute at Cornell University. "A committed, nefarious actor could already develop bioweapons, but AI lowers the barriers and removes the need for technical expertise."

This sort of thing is what global leaders were discussing in Paris this week. The consensus from the AI Summit, however, was not encouraging, if you care about the future of the human race. Vice President Vance called for "unparalleled R&D investments" into AI and for international regulatory regimes that foster the creation of AI technology rather than strangle it. This reflects the same anti-guardrail principles that were in the executive order President Trump signed in January revoking President Joe Biden's plan for safe and responsible AI development. For the Trump administration, at least, the goal for AI development seems to be growth and dominance at all costs.
But it's not clear that the companies developing this technology, including OpenAI, feel the same way. Many of the limitations I found in Operator, for instance, were imposed by its creators. The AI agent's slow-moving, second-guessing nature made it less useful but also more approachable and safe.

Operator is very clearly an experiment. It's telling that OpenAI rolled it out for ChatGPT Pro subscribers, who are clearly enthusiastic enough and bullish enough about AI that they're willing to spend a four-figure sum annually to access the latest features. Based on their feedback, OpenAI will undoubtedly release a tweaked and improved version, and then iterate again. In a couple of years, when the kinks are worked out, maybe we'll know how scared we should be about a future powered by AI agents.

A version of this story was also published in the Vox Technology newsletter.