Amazon announced Nova Act on Monday. Nova Act is a general-purpose AI agent that can take control of a web browser and perform some simple actions independently. Alongside the new agentic AI model, Amazon is releasing the Nova Act SDK, a toolkit that allows developers to build agent prototypes with Nova Act.
Developed by Amazon’s recently opened San Francisco-based AGI lab, Nova Act will also power key features of the company’s upcoming Alexa+ upgrade, a generative AI-enhanced version of Amazon’s popular voice assistant. However, the version of Nova Act available starting today is a bit less refined. Amazon calls it a research preview.
Developers can access the Nova Act toolkit on nova.amazon.com, a new website that also serves as a showcase for Amazon’s various Nova foundation models.
Nova Act is Amazon’s attempt to take on OpenAI’s Operator and Anthropic’s Computer Use. Several large tech companies believe that AI agents that can navigate the web for users will make today’s AI chatbots significantly more useful.
Amazon may not be the first to ship this kind of agent technology, but it may have the widest reach through Alexa+.
According to Amazon, developers should be able to build applications with the Nova Act SDK that automate basic actions on a user’s behalf, such as ordering salads from Sweetgreen or making dinner reservations. With the Nova Act toolkit, developers can string together tools that allow AI agents to navigate web pages, fill out forms, and select dates on a calendar.
Amazon claims that Nova Act outperforms agents from OpenAI and Anthropic on several of the company’s internal tests. For example, on ScreenSpot Web Text, which measures how AI agents interact with on-screen text, Nova Act scored 94%, surpassing OpenAI’s CUA (which scored 88%) and Anthropic’s Claude 3.7 Sonnet (90%).
However, Amazon did not benchmark Nova Act using more common agent evaluations such as WebVoyager.
Nova Act is the first public product to emerge from Amazon’s aforementioned AGI lab, an initiative co-led by former OpenAI researchers David Luan and Pieter Abbeel. Both previously founded startups (Luan started Adept, while Abbeel co-founded Covariant) before Amazon hired them last year to lead its AI agent efforts.
It may seem strange that an AGI lab is building AI agents that can order salads, but Luan told TechCrunch that he sees agents as an important step toward creating superintelligent AI systems. Luan defines AGI as “an AI system that helps humans do anything they do on a computer.”
Luan says his team designed the Nova Act SDK to reliably automate short, simple tasks, and to give developers tools to define exactly when they want a human to intervene in their agent workflows. He hopes this will allow developers to create more reliable agent applications, though not necessarily fully autonomous ones.
Amazon is releasing its first generalist AI agent into a busy space, but this is a key technology that the company has a lot riding on. Early testing of Nova Act will offer a glimpse into some of the long-awaited capabilities of Alexa+, which is a make-or-break moment for Amazon’s AI efforts.
The main issue with early AI agents from OpenAI, Google, and Anthropic is reliability across different domains. In TechCrunch’s testing, these systems are slow, struggle to work independently for very long, and tend to make mistakes that a human would not. It won’t be long before we see whether Amazon has cracked the code or whether its agent suffers from the same flaws that plague its competitors.