
AI Agents in 2025, with Examples

1. Introduction to AI Agents

AI agents are autonomous or human-in-the-loop systems that take human input, process it, take action, and return results to the end user. Unlike familiar chatbots, which answer a query in a single step, agents can execute multi-step processes, as I’ll show you through some real-world examples.
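That multi-step behavior boils down to a loop: plan the next action, execute it, observe the result, repeat until done. Here is a minimal, framework-free sketch of that loop; `plan_next_action` and the step names are purely illustrative stand-ins for an LLM call and real tools.

```python
# Minimal sketch of a multi-step agent loop (illustrative, not any specific product).
# In a real system, plan_next_action would call an LLM; here it is a stub.

def plan_next_action(goal, history):
    """Decide the next step -- a stand-in for an LLM planning call."""
    steps = ["search", "compare", "purchase"]
    return steps[len(history)] if len(history) < len(steps) else "done"

def execute(action, goal):
    """Run the chosen tool and return an observation."""
    return f"{action} completed for: {goal}"

def run_agent(goal):
    history = []
    while True:
        action = plan_next_action(goal, history)
        if action == "done":
            break
        history.append(execute(action, goal))  # observe and remember the result
    return history

print(run_agent("buy a USB-C cable"))
```

A human-in-the-loop system like ChatGPT’s agent adds one more thing to this loop: a checkpoint where the user can inspect `history` and interrupt before the next action runs.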

ChatGPT Agent (Tasks)

ChatGPT’s agent feature takes a user query, opens a browser, and performs tasks on your behalf. I’ve found it incredibly useful for everything from shopping on Amazon to booking flight tickets or even applying for jobs. What’s particularly interesting is that you can interrupt the agent at any point—making it a true human-in-the-loop system where you maintain control while the AI does the heavy lifting.

Manus AI Agent

Manus is another fascinating general-purpose agent I’ve been experimenting with. It takes user queries and executes them in a Linux VM, delivering the completed task back to you. I’ve used it to generate research reports, create presentations, and even build website code. The power here is that it handles the entire execution environment for you.

Now that we’ve seen some practical examples, let me dive into the AI agent architecture—how do we actually build one of these?

AI Agent Definition and Architecture

Like many developers, I sometimes dream about AI doing all the work while I never have to code again. It’s a tantalizing thought, isn’t it? But this vision has sparked intense debate. Some see it as liberation, while others worry about AI replacing software developers entirely. Turing Award winner and AI pioneer Yoshua Bengio has already warned against the massive replacement of talent. Meanwhile, Andrej Karpathy famously tweeted that “English is the new programming language.”

Amidst all these changes in the software world, the core concept is the AI agent itself. These aren’t like Agent Smith from The Matrix—they’re autonomous programs that can think (reason), plan, and act on a given cue to complete tasks either autonomously or with human oversight.

At the heart of every agent is an LLM (Large Language Model) that provides the know-how for performing tasks. I’ve experimented with different frameworks that give you access to the model and agent capabilities—from single-agent to multi-agent frameworks. My personal experience includes working with LangGraph and CrewAI from DeepLearning.AI.

Multi-Agent Frameworks

CrewAI and LangGraph are multi-agent frameworks that help developers build collaborative agent systems. For example, when I need to create content, I can set up a writing task with three specialized agents: a writer agent drafts the content, an editor agent refines it, and a publisher agent finalizes and distributes it. They coordinate seamlessly to produce the final article.
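The writer → editor → publisher flow above can be sketched as a sequential pipeline. This is a framework-agnostic, plain-Python illustration of the hand-off pattern, not actual CrewAI or LangGraph code; every function here is a toy stand-in for an LLM-backed agent.

```python
# Illustrative sketch of a sequential multi-agent pipeline (writer -> editor -> publisher).
# Each "agent" is a plain function standing in for an LLM-backed agent.

def writer_agent(topic):
    return f"Draft about {topic}."

def editor_agent(draft):
    # The editor refines the writer's output.
    return draft.replace("Draft", "Polished article")

def publisher_agent(article):
    # The publisher finalizes and "distributes" the result.
    return {"status": "published", "content": article}

def run_crew(topic):
    draft = writer_agent(topic)       # writer drafts the content
    edited = editor_agent(draft)      # editor refines it
    return publisher_agent(edited)    # publisher finalizes it

result = run_crew("AI agents")
print(result["status"], "-", result["content"])
```

In CrewAI the same shape is expressed declaratively: each role becomes an agent with a task, and the framework handles the sequencing and the LLM calls for you.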

Then there are evaluations (evals), which assess how well your agent framework performs. This is crucial for iterating on and improving your agent’s capabilities.
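At its simplest, an eval is just running the agent over a fixed set of test cases and scoring the outputs. The sketch below assumes a hypothetical `agent` callable and uses substring matching as the grader; real eval suites use far richer graders (exact match, LLM-as-judge, task completion checks).

```python
# Minimal eval harness sketch: score an agent against expected-substring test cases.
# The agent passed in is any callable from prompt -> text; grading here is naive.

def run_evals(agent, cases):
    results = []
    for prompt, expected_substring in cases:
        output = agent(prompt)
        results.append(expected_substring.lower() in output.lower())
    passed = sum(results)
    return {"passed": passed, "total": len(cases), "pass_rate": passed / len(cases)}

# Toy agent used only for demonstration.
def toy_agent(prompt):
    return f"The capital of France is Paris. (asked: {prompt})"

cases = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of France?", "Lyon"),   # deliberately failing case
]
print(run_evals(toy_agent, cases))
```

Tracking `pass_rate` across versions of your agent is what turns prompt tweaking into actual iteration.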

The foundational layer includes memory and databases. Memory systems like MemGPT give agents the ability to remember context across interactions, while vector databases like Pinecone and Chroma are widely used for storing and retrieving information efficiently.
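Conceptually, a vector store does two things: add documents, and retrieve the closest matches to a query. This toy in-memory version substitutes word overlap for real embeddings, so it is purely illustrative of the retrieval pattern, not how Pinecone or Chroma work internally.

```python
import re

# Toy "vector store": stores documents, retrieves the best match by word overlap.
# Real systems (Pinecone, Chroma) use dense embeddings and nearest-neighbor search.

class ToyMemory:
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append(text)

    def query(self, question, top_k=1):
        q_words = set(re.findall(r"\w+", question.lower()))
        scored = sorted(
            self.docs,
            key=lambda d: len(q_words & set(re.findall(r"\w+", d.lower()))),
            reverse=True,
        )
        return scored[:top_k]

memory = ToyMemory()
memory.add("The user prefers morning flights.")
memory.add("The user's favorite color is blue.")
print(memory.query("what flights does the user prefer?"))
```

An agent with memory runs a query like this before each LLM call and stuffs the retrieved text into the prompt as context, which is how systems like MemGPT carry information across interactions.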

Depending on what you’re building, you might interact with all these layers or none—you could simply use an agent externally as a tool without worrying about the underlying architecture.

Coding IDE-Based Agents

For us developers, there’s been an explosion of AI-powered IDEs. I’ve tried Cursor, Windsurf, VSCode with Copilot, Replit, and Firebase Studio. Each provides a coding agent that can modify code files, create web apps, and run commands on the command line. Interestingly, most IDEs default to Claude 3.5 Sonnet, except Firebase Studio which uses Gemini. These coding agents are just the beginning—similar AI agents are emerging in every domain you can imagine.
