
Build Your Own AI Agent Fleet: A Step-by-Step Guide to Shipping Faster with Virtual Teams

2026-05-02 18:30:28

Introduction

Imagine a team of seven AI assistants working around the clock to test your product, triage bugs, write release notes, and even fix issues – all without you lifting a finger. That’s exactly what the Coding Agent Sandboxes team at Docker achieved with their “Fleet” of virtual agents. By leveraging Claude Code skills and secure sandbox isolation, they transformed a set of static scripts into a dynamic, autonomous workforce that accelerates shipping. In this guide, you’ll learn how to build a similar fleet, step by step, from defining agent roles to running them seamlessly in CI. Whether you’re a solo developer or part of a larger team, this approach can help you ship faster and reduce manual overhead.

Source: www.docker.com

What You Need

- An AI coding agent that supports skill files (Docker's team used Claude Code)
- A sandboxed, isolated environment with access to your source repository
- A CI system that can run the same sandbox on a schedule (e.g., GitHub Actions)
- A shortlist of the manual tasks you want to hand off

Step 1: Define Your Agent Roles and Responsibilities

Before writing any code, identify the manual tasks that slow down your shipping. Common candidates include exploratory testing, regression checks, triaging issues, writing release notes, and fixing repetitive bugs. For each task, define a distinct role (e.g., “CLI Tester,” “Build Engineer,” “Release Manager”). Give each role a clear set of responsibilities and boundaries. For example, a CLI tester should focus on exercising commands and reporting failures, while a build engineer might manage version upgrades and performance benchmarks. This clarity will guide the skill file you create in the next step.


Step 2: Create Skill Files as Role Descriptions

A skill file is a markdown document that describes an agent’s persona, what it knows, and how it makes decisions. Think of it as a role description, not a script. For instance, a build engineer skill might say: “You are an expert in Docker builds. Your main task is to compile the CLI tool across three platforms (macOS, Linux, Windows) and flag any compilation errors. You have access to a sandbox with internet and the source repository. When a build fails, investigate the error message and propose a fix.”

Structure your skill file with clear sections: Persona, Responsibilities, Tools & Permissions, Decision Rules. Use natural language that the AI can interpret. The key is to enable judgment – if a test fails unexpectedly, the agent should investigate, not stop. Save each file with a descriptive name like /cli-tester-skill.md. Ensure the same skill behaves identically whether run on your laptop or in CI.
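Putting those sections together, a skill file might look like the sketch below. The role name, tool list, and decision rules are illustrative examples, not Docker's actual skill content:

```markdown
# CLI Tester

## Persona
You are a meticulous QA engineer who tests a command-line tool the way
a skeptical first-time user would.

## Responsibilities
- Build the CLI from the current branch and exercise its core commands.
- Report every failure with the exact command, the output, and the
  behavior you expected.

## Tools & Permissions
- A sandbox with the source repository checked out and internet access.
- Permission to run builds and file issues; no permission to push code.

## Decision Rules
- If a command fails unexpectedly, investigate before reporting: re-run
  it, read the error, and check recent commits for a likely cause.
- If you are unsure whether behavior is a bug, file it as a question
  rather than staying silent.
```

Note that nothing here is imperative scripting; every line describes judgment the agent should exercise, which is what lets the same file work both locally and in CI.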


Step 3: Test Skills Locally First

Do not wire your skill directly into a CI workflow. Instead, run it on your development machine using the same environment (sandbox plus AI agent) that CI will use. Invoke the skill manually: watch the agent think, note where it gets confused, and review the decisions it makes. This fast feedback loop saves hours of debugging later. For example, if your CLI tester skill builds the binary and runs commands, observe whether it correctly identifies a broken flag or misinterprets an error message. Tweak the skill file, re-invoke, and repeat until it performs as desired. Only after local success should you consider CI integration.
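To keep the local loop fast, it can help to wrap the invocation in a small helper so each tweak-and-rerun cycle is one command. This is a hypothetical sketch: it assumes the Claude Code CLI's non-interactive `claude -p <prompt>` mode, and the idea of prepending the skill file to the task prompt is an illustrative convention, not Docker's documented setup.

```python
import subprocess
from pathlib import Path


def build_skill_command(skill_file: str, task: str) -> list[str]:
    """Assemble the CLI invocation for a local skill run.

    Assumes Claude Code's non-interactive print mode (`claude -p`);
    adjust the command for whichever agent runner you use.
    """
    skill_text = Path(skill_file).read_text()
    prompt = f"{skill_text}\n\nTask: {task}"
    return ["claude", "-p", prompt]


def run_skill(skill_file: str, task: str) -> str:
    """Invoke the agent and return its transcript for inspection."""
    result = subprocess.run(
        build_skill_command(skill_file, task),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Because the command is built in one place, the CI workflow in the next step can reuse exactly the same invocation, which is the whole point of keeping a single skill file.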


Step 4: Wire the Skill into CI Without Modification

Your skill file is now validated locally. To run it in CI, create a workflow (e.g., GitHub Actions) that sets up the sandbox environment, checks out the repository, and calls the exact same skill file. Do not create a separate “CI version.” The workflow should only handle environment variables, secrets, and scheduling – the agent’s logic remains untouched. For example, a nightly workflow can trigger the CLI tester on macOS, Linux, and Windows runners simultaneously. The same skill file that worked on your laptop now runs autonomously in CI, producing consistent results.
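A sketch of what such a nightly workflow could look like in GitHub Actions. The runner matrix and checkout step use standard Actions syntax; the final agent-invocation step is an assumption – substitute whatever command you used to run the skill locally:

```yaml
name: nightly-cli-tester
on:
  schedule:
    - cron: "0 2 * * *"   # run nightly at 02:00 UTC

jobs:
  test:
    strategy:
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      # Environment variables and secrets live in the workflow,
      # never in the skill file itself.
      - name: Run the CLI tester skill
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        # Hypothetical invocation -- mirror your local command exactly.
        run: claude -p "$(cat cli-tester-skill.md)"
```

The workflow file contains zero agent logic: if the agent misbehaves, you edit the skill file and re-test locally, never the YAML.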


Step 5: Iterate and Expand Your Fleet

Once your first agent is running in CI, monitor its reports and performance. Use the insights to refine the skill file – add new responsibilities, adjust decision rules, or improve prompting. Then, repeat the cycle for other roles: create a skill, test locally, add to CI. Docker’s fleet, for instance, grew to seven roles covering testing, triage, release notes, and bug fixing. Over time, you can schedule agents to run on different triggers (pull requests, nightly, weekly) and even let them collaborate (e.g., a tester agent files an issue, a triage agent reads it and auto-assigns). Keep each skill focused and independent to avoid conflicts.
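As the fleet grows, each agent can get its own trigger. A sketch of the `on:` blocks two different agents might use, shown side by side (the workflow file names and schedules are illustrative):

```yaml
# triage-agent.yml -- reacts to newly opened issues
on:
  issues:
    types: [opened]

# release-notes-agent.yml -- runs weekly, before the release cut
on:
  schedule:
    - cron: "0 9 * * MON"
```

Keeping one workflow file per agent preserves the independence the step above recommends: a misfiring triage agent can be paused without touching the testers.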


Tips for Success

- Keep each skill focused and independent so agents do not conflict with one another.
- Always validate a skill locally before wiring it into CI, and never fork a separate “CI version.”
- Start with one role, prove its value, then expand the fleet one agent at a time.

Building an AI agent fleet is not about replacing your team – it’s about augmenting it. By following these steps, you can automate repetitive tasks, reduce manual toil, and ship more confidently. The Docker team proved that a virtual squad of agents can dramatically speed up development, and now you have the blueprint to do the same. Happy coding!
