Enhancing Digital Interactions with AI Agents for Greater Convenience and Efficiency
Abstract
AI agents are software entities that perform digital tasks autonomously, adapting to new challenges and
environments without direct human control. Recently, AI agents powered by large language models such as GPT-4 have been
integrated into operating systems such as Ubuntu, Windows, and macOS. These agents can install apps, edit images,
remove website trackers, and automate routine digital tasks. OSWorld is a platform that supports research on
such agents. In this project, we reproduced the OSWorld environment and analyzed how well agents driven by GPT-4o
complete digital tasks. The results indicate that AI agents can be applied across a wide range of digital
applications, offering practical benefits and assistance to users in various settings.
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.