Vercel eve: an open framework for building AI agents

Jun 20, 2026

A look at eve, Vercel's open-source agent framework. An agent is a directory of files, with durable execution, sandboxes, approvals, and evals built in.

~~~

Vercel just shipped a new open-source framework, and like a lot of things lately, it’s not for websites. It’s for AI agents.

It’s called eve, lowercase, and the pitch is one line: it’s like Next.js, but for agents.

If you’ve tried to build an agent and take it past a demo, you know the pain. The model part is easy. It’s everything around it (durability, sandboxing, approvals, logging) that you end up hand-rolling every single time. eve’s whole idea is that you shouldn’t have to.

An agent is a directory

Here’s the core idea, and it’s a good one. In eve, an agent is just a folder of files. Each file is one piece of the agent.

agent/
  agent.ts             # the model it runs on
  instructions.md      # who it is
  tools/
    run_sql.ts         # what it can do
  skills/
    revenue.md         # what it knows
  subagents/
    investigator/      # who it delegates to
  channels/
    slack.ts           # where it lives
  schedules/
    monday.ts          # when it acts on its own

Look at that tree and you already know what the agent is, what it can do, where it lives, and when it runs by itself. No config to read, no wiring to trace.

If you’ve used Next.js, this will feel familiar. Next.js turns a file into a route by owning the routing. eve turns a file into an ability by owning the agent loop. You add a file, eve picks it up.
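The file-to-capability mapping can be pictured as a small function over a list of paths. The sketch below is not eve's implementation: `Manifest` and `buildManifest` are hypothetical names, used only to illustrate convention-based discovery with the folder names from the tree above.

```typescript
// Hypothetical illustration of convention-based discovery; not eve's actual code.
type Kind = 'tools' | 'skills' | 'subagents' | 'channels' | 'schedules'

type Manifest = Record<Kind, string[]>

const KINDS: Kind[] = ['tools', 'skills', 'subagents', 'channels', 'schedules']

// Derive capability names from relative paths such as "tools/run_sql.ts".
function buildManifest(paths: string[]): Manifest {
  const manifest: Manifest = {
    tools: [], skills: [], subagents: [], channels: [], schedules: [],
  }
  for (const path of paths) {
    const [folder, entry] = path.split('/')
    const kind = KINDS.find((k) => k === folder)
    if (!kind || !entry) continue // top-level files like agent.ts are not capabilities
    // The name is the file or directory name without its extension.
    const name = entry.replace(/\.[^.]+$/, '')
    if (manifest[kind].indexOf(name) === -1) manifest[kind].push(name)
  }
  return manifest
}

const manifest = buildManifest([
  'agent.ts',
  'instructions.md',
  'tools/run_sql.ts',
  'skills/revenue.md',
  'subagents/investigator/instructions.md',
  'channels/slack.ts',
  'schedules/monday.ts',
])
```

Here `manifest.tools` holds the single name `run_sql`, which is the sense in which the filename becomes the capability name.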

Let’s walk through the pieces.

Who the agent is

Two files define the heart of an agent.

instructions.md is the system prompt, in plain Markdown. This is the agent’s job and personality:

You are a senior data analyst. You answer questions about the team's data.

- Prefer exact numbers to hand-waving. If you can compute it, compute it.
- State the assumptions behind any number you report.
- Use the tools available rather than guessing.

agent.ts is where you pick the model and configure the runtime:

import { defineAgent } from 'eve'

export default defineAgent({
  model: 'anthropic/claude-opus-4.8',
})

That’s a working agent. In fact, instructions.md on its own is enough to run something; everything else adds capabilities.

What it can do: tools

A tool is an action the agent can take. In eve, a tool is one TypeScript file in tools/. The filename becomes the tool name, and there’s nothing to register:

import { defineTool } from 'eve/tools'
import { z } from 'zod'

export default defineTool({
  description: 'Get the weather for a city',
  inputSchema: z.object({
    city: z.string(),
  }),
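  async execute({ city }) {
    // Encode the city so names with spaces or symbols still form a valid URL
    const res = await fetch(
      `https://api.weather.com/current?city=${encodeURIComponent(city)}`,
    )
    // Surface HTTP errors instead of handing an error body to the model
    if (!res.ok) throw new Error(`Weather request failed: ${res.status}`)
    return res.json()
  },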
})

The Zod schema describes the input, so the model knows exactly how to call it. You write the function, eve hands it to the model.

What it knows: skills

A skill is a Markdown playbook the agent loads only when it’s relevant. This keeps the prompt focused, instead of stuffing every rule into every call.

---
description: How this team defines revenue. Load before any revenue question.
---

Revenue is recognized net of refunds, over the subscription term.
Weeks are Monday-anchored, in UTC.
Exclude trial and internal accounts from every number.

The description is the trigger. When a task looks like it needs this knowledge, eve loads it. The rest of the time it stays out of the way.
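One way to picture the trigger: only each skill's description needs to be visible up front, and the body is pulled in on demand. The sketch below parses the frontmatter format shown above. `parseSkill` and the `Skill` type are hypothetical names, and how eve actually decides relevance isn't covered here.

```typescript
// Hypothetical sketch: split a skill file into its description and body.
interface Skill {
  description: string
  body: string
}

function parseSkill(source: string): Skill {
  // Frontmatter is the block between the first two '---' lines.
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(source)
  if (!match) return { description: '', body: source.trim() }
  const [, frontmatter, body] = match
  const line = frontmatter.split('\n').find((l) => l.startsWith('description:'))
  const description = line ? line.slice('description:'.length).trim() : ''
  return { description, body: body.trim() }
}

const skill = parseSkill(
  [
    '---',
    'description: How this team defines revenue. Load before any revenue question.',
    '---',
    '',
    'Revenue is recognized net of refunds, over the subscription term.',
  ].join('\n'),
)
```

Keeping the description separate from the body is what lets a runtime show the model a short index of skills without paying for every playbook on every call.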

The rest of the shape

The same “one file, one capability” pattern covers everything else:

subagents/ — a folder for a child agent with its own instructions and tools. The parent delegates to it like calling a tool.
channels/ — where the agent shows up. One small file each for Slack, Discord, Teams, and more. The same agent serves all of them.
schedules/ — a cron expression and a handler, so the agent runs on its own clock.
connections/ — auth to outside services like GitHub, Linear, or Stripe, so tools can call them without you juggling tokens.

You don’t learn a new concept for each one. It’s the same idea applied in a different folder.
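For schedules/, the monday.ts file in the tree suggests a weekly run. As an illustration of what a cron expression encodes (not of eve's schedule API, which isn't shown in this post), the sketch below checks whether a UTC timestamp matches a five-field expression. `matchesCron` is a hypothetical helper and supports only plain numbers and `*`.

```typescript
// Hypothetical helper: match a UTC date against "minute hour day-of-month month day-of-week".
// Only plain numbers and '*' are supported; ranges, lists, and steps are left out.
// Unlike real cron, restricting both day fields requires both to match.
function matchesCron(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/)
  if (fields.length !== 5) throw new Error('expected five cron fields')
  const actual = [
    date.getUTCMinutes(),
    date.getUTCHours(),
    date.getUTCDate(),
    date.getUTCMonth() + 1, // cron months are 1-12
    date.getUTCDay(), // 0 is Sunday, 1 is Monday
  ]
  return fields.every((field, i) => field === '*' || Number(field) === actual[i])
}

// "0 9 * * 1" means 09:00 every Monday.
const mondayMorning = '0 9 * * 1'
const isDue = matchesCron(mondayMorning, new Date(Date.UTC(2026, 5, 22, 9, 0)))
```

June 22, 2026 is a Monday, so `isDue` is true for that timestamp and false for the same time a day later.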

What you get for free

This is the part that makes eve more than a nice file layout. The hard production stuff ships with the framework.

Durable execution. Every conversation is a durable workflow. Each step is checkpointed, so a session can pause, survive a crash or a deploy, and pick up exactly where it stopped. Agents that wait hours for a reply or a slow system don’t fall over.
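The checkpointing idea can be sketched independently of any framework: record each step's result under a stable key, and on restart replay recorded results instead of re-running the step. Everything below (`CheckpointStore`, `runStep`, the in-memory `Map`) is a hypothetical illustration, not eve's or Vercel Workflows' API, and a real system would persist the store.

```typescript
// Hypothetical sketch of checkpoint-and-replay; a real store would be persistent.
type CheckpointStore = Map<string, unknown>

async function runStep<T>(
  store: CheckpointStore,
  key: string,
  step: () => Promise<T>,
): Promise<T> {
  // A recorded result means the step already finished: replay it.
  if (store.has(key)) return store.get(key) as T
  const result = await step()
  store.set(key, result) // checkpoint before moving on
  return result
}

let calls = 0 // counts how often the first step really executes

async function session(store: CheckpointStore, crashAfterFirstStep: boolean): Promise<string> {
  const rows = await runStep(store, 'query', async () => {
    calls += 1
    return 42
  })
  if (crashAfterFirstStep) throw new Error('simulated crash')
  return runStep(store, 'summarize', async () => `rows: ${rows}`)
}

// The first attempt crashes after step one; the second resumes without re-running it.
async function demo(): Promise<{ answer: string; calls: number }> {
  const store: CheckpointStore = new Map()
  await session(store, true).catch(() => undefined)
  const answer = await session(store, false)
  return { answer, calls }
}
```

The point of the stable key is that the second attempt runs the same code path but skips work that already completed, which is what lets a session survive a crash or a deploy.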

A sandbox for every agent. Agent-written code is treated as untrusted, so it never runs in your app. Each agent gets its own isolated environment for shell commands and files. In production it runs on Vercel Sandbox; locally it can run on Docker.

Human-in-the-loop approvals. Some actions shouldn’t happen without a person saying yes. Mark a tool as needing approval, and the agent pauses there and waits, without burning any compute, until someone approves. Then it continues.

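import { defineTool } from 'eve/tools'
import { z } from 'zod'

export default defineTool({
  description: 'Run a SQL query against the warehouse',
  inputSchema: z.object({ sql: z.string() }),
  // Pause for approval on large scans; estimateScanGb is assumed to be your own helper (not shown)
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput.sql) > 50,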
  async execute({ sql }) {
    // ...
  },
})
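To make the pause-then-continue flow concrete, here is a framework-free sketch of an approval gate. The `needsApproval` predicate mirrors the snippet above, but `GatedTool`, `callTool`, and the caller-supplied `scanGb` estimate are hypothetical. eve's runtime handles this for you and may work differently.

```typescript
// Hypothetical sketch of an approval gate; not eve's runtime.
interface GatedTool<I, O> {
  needsApproval: (input: I) => boolean
  execute: (input: I) => O
}

type CallResult<O> =
  | { status: 'done'; output: O }
  | { status: 'pending'; approve: () => O } // nothing runs until approve() is called

function callTool<I, O>(tool: GatedTool<I, O>, input: I): CallResult<O> {
  if (tool.needsApproval(input)) {
    return { status: 'pending', approve: () => tool.execute(input) }
  }
  return { status: 'done', output: tool.execute(input) }
}

const executed: string[] = [] // records which queries actually ran

const runSql: GatedTool<{ sql: string; scanGb: number }, string> = {
  // Stand-in for a real scan estimate: the caller supplies the size directly.
  needsApproval: ({ scanGb }) => scanGb > 50,
  execute: ({ sql }) => {
    executed.push(sql)
    return `ran: ${sql}`
  },
}

const small = callTool(runSql, { sql: 'SELECT 1', scanGb: 1 })
const large = callTool(runSql, { sql: 'SELECT * FROM events', scanGb: 120 })
```

The small query runs immediately, while the large one comes back as `pending` and executes only when someone calls `approve()`.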

Tracing and evals. Every run produces a trace: each model call and tool call in order, with inputs and outputs. The traces are standard OpenTelemetry, so they export to whatever you already use. And evals let you test an agent like any other code, with scored checks you run locally or in CI.
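A scored check can be as small as a function from an agent's answer to a number between 0 and 1. The sketch below shows that idea in plain TypeScript. `EvalCase`, `scoreCase`, and `runEvals` are hypothetical names and do not reflect eve's eval file format, which isn't shown in this post.

```typescript
// Hypothetical sketch of scored checks; not eve's eval format.
interface EvalCase {
  question: string
  mustMention: string[] // phrases a good answer should contain
}

// Score is the fraction of required phrases found, ignoring case.
function scoreCase(testCase: EvalCase, answer: string): number {
  if (testCase.mustMention.length === 0) return 1
  const lower = answer.toLowerCase()
  const hits = testCase.mustMention.filter((p) => lower.indexOf(p.toLowerCase()) !== -1)
  return hits.length / testCase.mustMention.length
}

// Run every case through an agent function and average the scores.
function runEvals(cases: EvalCase[], agent: (question: string) => string): number {
  if (cases.length === 0) return 0
  const total = cases.reduce((sum, c) => sum + scoreCase(c, agent(c.question)), 0)
  return total / cases.length
}

// A canned agent stands in for a real model call.
const fakeAgent = (question: string): string =>
  question.indexOf('revenue') !== -1
    ? 'Revenue is net of refunds, excluding trial accounts.'
    : 'I do not know.'

const averageScore = runEvals(
  [
    { question: 'What was revenue last week?', mustMention: ['net of refunds', 'trial'] },
    { question: 'How many signups?', mustMention: ['signups'] },
  ],
  fakeAgent,
)
```

Because the result is a plain number, a CI job can fail the build when the average drops below a threshold you choose.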

None of this is something you bolt on later. It’s there from the first file.

The dev loop

Running an eve agent is one command:

eve dev

You get a terminal UI where you can talk to the agent and watch what it does: which skill it loaded, which tool it ran, what it answered. No more squinting at stdout.

Testing is just as direct. You write evals as files and run them:

eve eval

And shipping is the part Vercel obviously nailed. An eve agent is an ordinary Vercel project, so it deploys like anything else:

vercel deploy

The same directory that ran on your laptop runs in production. The sandbox swaps to Vercel Sandbox with no code change.

It runs on Vercel’s primitives

eve isn’t built from nothing. It stitches together the AI pieces Vercel already had: AI Gateway for model calls (with provider fallbacks), Vercel Sandbox for isolated execution, Vercel Workflows for the durable sessions, and Vercel Connect for auth.

That’s the real advantage of a framework like this. All the infrastructure that you’d normally glue together yourself is already wired up.
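Provider fallback is one of those pieces you would otherwise write yourself. Purely to illustrate the behavior, the sketch below tries a list of providers in order and returns the first success. `Provider` and `callWithFallback` are hypothetical, the model calls are faked, and none of this is AI Gateway's API.

```typescript
// Hypothetical sketch of ordered provider fallback; not AI Gateway's API.
interface Provider {
  name: string
  complete: (prompt: string) => Promise<string>
}

async function callWithFallback(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const failures: string[] = []
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt)
      return { provider: provider.name, text }
    } catch (error) {
      // Remember why this provider failed, then try the next one.
      failures.push(`${provider.name}: ${error instanceof Error ? error.message : String(error)}`)
    }
  }
  throw new Error(`all providers failed (${failures.join('; ')})`)
}

// Fake providers: the first always fails, the second echoes the prompt.
const primary: Provider = {
  name: 'primary',
  complete: async () => {
    throw new Error('rate limited')
  },
}

const backup: Provider = {
  name: 'backup',
  complete: async (prompt) => `echo: ${prompt}`,
}

const reply = callWithFallback([primary, backup], 'hello')
```

Collecting the failure reasons matters when every provider is down: the final error then says what went wrong with each one instead of only the last.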

My take

I like this a lot, and not only because of eve itself.

There’s a clear pattern forming across the ecosystem: an agent is a directory of files. Markdown for instructions and skills, code for tools, folders for channels and schedules. I wrote about Flue the other day, from the Astro folks, and the shape is strikingly similar.

When different teams independently land on the same structure, it usually means the abstraction is real. Agents have a shape now, and frameworks are starting to capture it, the same way Next.js captured the shape of a web app.

The thing eve brings is that all the hard production parts (durability, sandboxing, approvals, evals) come in the box, on infrastructure Vercel already runs at scale.

If you’ve built an agent and felt the pain of everything around the model, eve is worth a look.

Try it

The public preview is open, and the CLI walks you through your first agent in about a minute:

npx eve@latest init my-agent

It’s open source at github.com/vercel/eve, with docs at eve.dev/docs.