The Ultimate Guide to Agentic Coding

Ted Werbel
AI · Architecture · Developer Experience · Monorepo · Testing · Claude Code

Vibe coding produces technical debt at scale and disconnects developers from how their systems actually work. Learn how to structure a monorepo with hexagonal architecture, documentation layers, comprehensive testing, and custom tooling to make AI coding agents genuinely effective.

"Just tell it what you want and it'll build it." That's the promise of vibe coding. Express your intent in natural language, let the AI figure out the rest. Ship fast, iterate later.

Here's what actually happens: poorly structured code, tightly coupled systems, no tests, no documentation, and a mountain of technical debt that compounds with every commit. Six months later, you're staring at a codebase that neither you nor the AI can reason about.

Vibe coding fails because it substitutes intent for understanding. People who don't understand engineering express what they want without expressing how the system should work. The AI fills in the gaps with its best guesses—and those guesses accumulate into architectural chaos.

This article is for two audiences:

  1. Vibe coders who want to level up and actually understand what they're building
  2. Engineering organizations struggling to prepare their codebases for AI coding agents

The thesis is simple: Agentic coding will always beat vibe coding. Understanding your systems, designing clean architecture, and investing in documentation and testing isn't just good practice - it's what makes AI agents actually effective.

Here's how to structure a monorepo to make AI coding agents work at their full potential.


The Monorepo Advantage

The first architectural decision that pays dividends: consolidate your code into a single repository.

When AI agents work with your codebase, they need context. With separate repositories, they're constantly jumping between repos, losing track of shared patterns, and making changes that break integrations they can't see. With a monorepo, everything is visible.

Example Structure

monorepo/
├── products/                    # Product-specific code
│   ├── product-a/
│   │   ├── apps/                # Web, API, Lambda, CLI
│   │   ├── packages/            # Core, DB, AI (hexagonal)
│   │   └── docs/                # Product-specific documentation
│   └── product-b/
│       └── ...

├── packages/                    # Shared utilities (cross-product)
│   ├── observability/           # OpenTelemetry instrumentation
│   ├── rate-limit/              # Upstash Redis rate limiting
│   ├── cost-registry/           # AI model cost tracking
│   └── ...

├── apps/                        # Internal tooling
│   ├── team-docs/               # Fumadocs documentation site
│   └── agent-tracer/            # Trace visualization

└── internal-docs/               # Monorepo-wide documentation
    ├── adr/                     # Architecture Decision Records
    ├── ideas/                   # Cross-product brainstorming
    └── bugs/                    # Cross-product bug reports
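
If the monorepo runs on pnpm workspaces (the commands later in this article assume pnpm), the root workspace file maps directly onto this layout. A sketch:

# pnpm-workspace.yaml - hypothetical, matching the structure above
packages:
  - "products/*/apps/*"
  - "products/*/packages/*"
  - "packages/*"
  - "apps/*"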

Products vs. Apps: A Subtle Distinction

Some teams prefer a flat apps/ directory without the products/ layer. That works fine for single-product companies. But by creating a products/ directory, you ensure that any new product your organization creates is self-contained:

  • Its own docs/ directory
  • Its own decoupled packages (core/, db/, ai/)
  • Its own infrastructure configuration
  • Its own test suites

When Product B launches, it doesn't pollute Product A. When an AI agent works on Product A, it has clear boundaries.

The Internal Apps Directory

Notice the apps/ folder at the monorepo root. This is for internal tooling that serves the entire organization:

  • team-docs: A Fumadocs-powered documentation site with human-readable docs. While CLAUDE.md files are for AI agents, team docs are for humans. Every product gets its own section. Shared utilities get their own section. As changes are made, team docs are updated.

  • agent-tracer: A visualization tool for OpenTelemetry traces, helping debug AI agent and system behavior. This one is entirely optional - I built it purely as a nicer-looking alternative to Grafana that makes it easier to filter spans, visualize metrics, and quickly optimize agents, all on top of Loki, Tempo and Mimir.

These internal apps don't ship to customers - their purpose is to make the team more effective.

Monorepo-Wide Documentation

Just as each product has a docs/ folder, the monorepo root has internal-docs/ with the same structure:

internal-docs/
├── adr/        # Architecture decisions affecting multiple products
├── ideas/      # Ideas for shared packages or cross-product features
├── bugs/       # Bugs in shared infrastructure
├── plans/      # Migration plans affecting the whole monorepo
└── specs/      # Specifications for shared utilities

This mirrors the product-level structure, so AI agents know exactly where to find and store information at any level.


Hexagonal Architecture

Separation of concerns isn't just good engineering - it's essential for AI comprehension. When your business logic, database access, and API layer are tangled together, AI agents produce more tangled spaghetti code. When they're cleanly separated, agents can modify one layer without breaking the others.

The Pattern

                    ┌─────────────────┐
                    │   Application   │
                    │   (Use Cases)   │
                    └────────┬────────┘

         ┌───────────────────┼───────────────────┐
         │                   │                   │
    ┌────▼────┐        ┌─────▼─────┐       ┌─────▼─────┐
    │  Ports  │        │  Domain   │       │  Ports    │
    │  (In)   │        │  (Core)   │       │  (Out)    │
    └────┬────┘        └───────────┘       └─────┬─────┘
         │                                       │
    ┌────▼────┐                            ┌─────▼─────┐
    │Adapters │                            │ Adapters  │
    │ (API)   │                            │ (DB, LLM) │
    └─────────┘                            └───────────┘

Key Principles:

  • Domain (core/): Pure business logic, Zod schemas, no external dependencies
  • Ports: Interfaces defining what the domain needs (repositories, providers)
  • Adapters: Implementations (Drizzle repos, Anthropic provider, Fastify routes)
  • Dependency Injection: Services receive ports, not concrete implementations

Package Structure

Every product follows the same pattern:

products/<product>/packages/
├── core/     # Domain logic, types, schemas (database-agnostic)
├── db/       # Database schema, repository implementations
└── ai/       # AI agents, tools, prompts (if applicable)

Concrete Example: Medication Manager

Here's how a simple medication management app might be structured:

packages/core/src/domains/medications/

// entity.ts - Pure TypeScript types
export interface Medication {
  id: string;
  userId: string;
  name: string;
  dosage: string;
  frequency: 'daily' | 'weekly';
  isActive: boolean;
  createdAt: Date;
}
 
// schema.ts - Zod validation (source of truth)
import { z } from 'zod';
export const createMedicationInputSchema = z.object({
  name: z.string().min(1).max(100),
  dosage: z.string().min(1).max(50),
  frequency: z.enum(['daily', 'weekly']),
});
 
export type CreateMedicationInput = z.infer<typeof createMedicationInputSchema>;
 
// repository.ts - Port interface (no implementation)
export interface MedicationRepository {
  findById(userId: string, id: string): Promise<Medication | null>;
  create(userId: string, input: CreateMedicationInput): Promise<Medication>;
  list(userId: string): Promise<Medication[]>;
  deactivate(userId: string, id: string): Promise<void>;
}
 
// service.ts - Business logic using the port
import { NotFoundError } from '../../errors'; // path illustrative - any shared error class works
export function createMedicationService(repo: MedicationRepository) {
  return {
    async create(userId: string, input: CreateMedicationInput) {
      const validated = createMedicationInputSchema.parse(input);
      return repo.create(userId, validated);
    },
    
    async deactivate(userId: string, id: string) {
      const medication = await repo.findById(userId, id);
      if (!medication) throw new NotFoundError('Medication not found');
      await repo.deactivate(userId, id);
    },
  };
}

packages/db/src/repositories/medication.repository.ts

// Adapter implementing the port
import { and, eq } from 'drizzle-orm';
import { medications } from '../schema';   // Drizzle table definition (path illustrative)
import type { Database } from '../client'; // your Drizzle instance type (path illustrative)
import type {
  Medication,
  CreateMedicationInput,
  MedicationRepository,
} from '@product-a/core'; // package name illustrative

export class DrizzleMedicationRepository implements MedicationRepository {
  constructor(private db: Database) {}
 
  async findById(userId: string, id: string): Promise<Medication | null> {
    const [result] = await this.db
      .select()
      .from(medications)
      .where(and(eq(medications.userId, userId), eq(medications.id, id)));
    return result ?? null;
  }
 
  async create(userId: string, input: CreateMedicationInput): Promise<Medication> {
    const [result] = await this.db
      .insert(medications)
      .values({ userId, ...input })
      .returning();
    return result;
  }
  
  // ... other methods
}
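
Tying the two packages together, a composition root near the app's entry point injects the adapter into the service. A minimal sketch (the package names and file location are illustrative):

// apps/api/src/composition.ts
import { createMedicationService } from '@product-a/core';
import { DrizzleMedicationRepository } from '@product-a/db';
import { db } from './db';

// Only this file knows the concrete adapter; route handlers and
// business logic see nothing but the MedicationRepository port.
const medicationRepo = new DrizzleMedicationRepository(db);
export const medicationService = createMedicationService(medicationRepo);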

Why This Matters for AI Agents

With hexagonal architecture:

  1. Clear boundaries = smaller, focused files: AI agents understand smaller contexts better

  2. Database logic is isolated: An agent can modify queries in db/ without touching business rules in core/

  3. Swappable implementations: Want to switch from Postgres to MongoDB? Only db/ changes. Want to swap your AI framework? Only ai/ changes. The core/ package doesn't care.

  4. Testing is straightforward: Mock the repository interface in core/ tests; use a real database in db/ tests (sketched below)

  5. Future-proofing: Need different datastores for different capabilities (SQL for transactions, vector DB for search)? Hexagonal makes this trivial.

The architecture absorbs change so your business logic doesn't have to.
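
To make point 4 concrete, here's a minimal core/ test sketch, assuming Vitest (any runner with mocks works the same way):

// packages/core/src/domains/medications/__tests__/service.test.ts
import { it, expect, vi } from 'vitest';
import { createMedicationService } from '../service';
import type { MedicationRepository } from '../repository';

it('refuses to deactivate a medication that does not exist', async () => {
  // The port is mocked - no database, no network, just the business rule.
  const repo: MedicationRepository = {
    findById: vi.fn().mockResolvedValue(null),
    create: vi.fn(),
    list: vi.fn(),
    deactivate: vi.fn(),
  };
  const service = createMedicationService(repo);

  await expect(service.deactivate('user-1', 'med-404'))
    .rejects.toThrow('Medication not found');
  expect(repo.deactivate).not.toHaveBeenCalled();
});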


Grounding AI with CLAUDE.md Files

AI coding agents need contextual documentation at every level of the codebase. CLAUDE.md files serve as the primary grounding mechanism—structured documentation that tells agents what they need to know about each area of the code.

The Hierarchy

monorepo/
├── CLAUDE.md                           # Monorepo overview, architecture, commands
├── products/
│   └── product-a/
│       ├── CLAUDE.md                   # Product overview, how packages connect
│       ├── packages/
│       │   ├── core/
│       │   │   └── CLAUDE.md           # Domain logic patterns, Zod conventions
│       │   ├── db/
│       │   │   └── CLAUDE.md           # Drizzle patterns, migration rules
│       │   └── ai/
│       │       └── CLAUDE.md           # Agent patterns, tool conventions
│       └── apps/
│           └── api/
│               └── CLAUDE.md           # Route patterns, auth, error handling
└── packages/
    └── observability/
        └── CLAUDE.md                   # Instrumentation patterns

What to Include

Monorepo Root CLAUDE.md:

  • Project overview and architecture
  • Common commands (pnpm install, pnpm build:packages)
  • Monorepo structure explanation
  • Links to key patterns and conventions
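
A trimmed sketch of what a root CLAUDE.md might contain (contents illustrative):

# Monorepo Guide

pnpm monorepo. Products live in products/, shared utilities in packages/,
internal tooling in apps/.

## Common Commands

- pnpm install          # install all workspace dependencies
- pnpm build:packages   # build shared and product packages
- pnpm test             # run every test suite

## Conventions

- Hexagonal architecture: core/ packages have no external dependencies
- Zod schemas are the source of truth for validation
- Read products/<product>/CLAUDE.md before touching a product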

Package-Level CLAUDE.md:

  • Package purpose and responsibilities
  • Key patterns used in this package
  • Testing commands specific to this package
  • "Important Rules" (things the AI must never do)

Example: Important Rules

## Important Rules
 
- **Never write raw SQL migrations**. Always use `pnpm db:generate` to create
  migrations from the Drizzle schema. Hand-written SQL will corrupt the 
  migration history.
 
- **Medications cannot be deleted**. They can only be deactivated via the
  `deactivate()` method. Historical dose data must be preserved.
 
- **All repository methods must scope by userId**. Never query without
  filtering by the requesting user's ID.

These rules prevent AI agents from making catastrophic mistakes. They're the guardrails that let you trust the AI to make changes.


Docs Folders: Long-Term Memory for AI Agents

Beyond CLAUDE.md files, every product (and the monorepo itself) should have a dedicated docs/ folder - a form of long-term memory for your coding agent. This also gives your team visibility into key decisions, specifications, bugs and other context around the AI's thinking process.

The Folder Structure

products/<product>/docs/
├── adr/       # Architecture Decision Records
├── analysis/  # Cost projections, token analysis, research
├── bugs/      # Bug reports with root cause analysis
├── design/    # UX analysis, wireframes, design docs
├── ideas/     # Future features, brainstorms, explorations
├── plans/     # Implementation plans, migration strategies
├── prd/       # Product Requirements Documents
├── specs/     # Technical specifications
└── tests/     # Manual test plans for human verification

The monorepo's internal-docs/ folder mirrors this structure for cross-product concerns.
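
ADRs can follow the familiar context/decision/consequences shape. A hypothetical entry:

# ADR-0007: Isolate persistence behind repository ports

Status: Accepted

## Context

core/ must stay database-agnostic so each capability can pick the
right datastore later.

## Decision

All data access goes through repository interfaces defined in core/.
The db/ package provides the Drizzle implementations.

## Consequences

Swapping datastores touches only db/. core/ tests mock the ports;
db/ tests run against a real database.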

Why Separate Product vs. Monorepo Docs?

Product docs (products/product-a/docs/):

  • Ideas specific to that product
  • Bugs in that product's code
  • Specs for that product's features
  • Design docs for that product's UI

Monorepo docs (internal-docs/):

  • Ideas for shared packages
  • Bugs in shared infrastructure
  • Architecture decisions affecting multiple products
  • Migration plans that touch everything

When an AI agent uses the brainstorming skill, it knows whether to save to product docs or internal docs based on scope.

Mapping Folders to Skills

Each folder has a corresponding skill that knows how to create properly formatted documents:

| Folder    | Skill                   | When It's Used                                |
|-----------|-------------------------|-----------------------------------------------|
| ideas/    | brainstorming           | Capture ideas without implementing            |
| bugs/     | bug-reporter            | Document bugs when stuck or ending a session  |
| tests/    | testing-documentation   | Manual test plans for human verification      |
| design/   | ux-designer             | UX analysis, wireframes, JTBD                 |
| prd/      | ux-designer             | Product requirements documents                |
| specs/    | interview               | Technical specs refined through Q&A           |
| analysis/ | agent-cost-optimization | Token analysis, cost projections              |

Why This Matters

  1. Extends your cognitive range: Mid-coding and have an idea? Tell the agent: "Use the brainstorming skill to document this idea about voice input." It captures the full context—problem statement, current state analysis, proposed solutions—and you return to focus without losing the thought.

  2. Reduces friction: No mental overhead of "where do I put this?" The structure + skills handle it.

  3. Team visibility: Your team can review what the AI (and you) have been thinking about. Ideas, bugs, specs—all in version control.

  4. Persistent memory: AI agents can reference past decisions, bugs, and ideas. The docs/ folder becomes their long-term memory.


Skills: Teaching AI How to Do Things Right

In traditional organizations, Standard Operating Procedures (SOPs) encode organization-wide knowledge. "Here's how we onboard a customer." "Here's how we handle a security incident." SOPs ensure consistency and quality regardless of who does the work.

Skills are SOPs for AI agents.

They're reusable prompt templates that encode domain expertise and enforce consistency. Instead of explaining the same patterns every time, you teach the agent once, then invoke the skill by name.

The Power of Skills

Without skills, your CLAUDE.md files and global rules would be enormous—trying to cover every scenario. With skills, you can keep your global rules minimal:

## Skills Reference
 
- **api-test-creation**: Creates Fastify API integration tests
- **db-test-creation**: Creates Drizzle ORM repository tests
- **brainstorming**: Documents ideas for future features
- **bug-reporter**: Creates detailed bug reports

A two-sentence description is enough. When you say "use the api-test-creation skill," the agent loads the full skill definition with all its patterns, examples, and conventions.
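
In Claude Code, for instance, a skill is a markdown file whose frontmatter carries that short description; the body holds the full procedure. A condensed, hypothetical example:

.claude/skills/api-test-creation/SKILL.md

---
name: api-test-creation
description: Creates Fastify API integration tests with consistent patterns.
---

When creating API integration tests:

1. Test against the real test database - never mock the repository layer.
2. Create a fresh user per test and assert other users' data stays invisible.
3. Cover the happy path, validation failures (400), and missing auth (401).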

Example Skills Library

| Skill | Description |
|---|---|
| agent-test-creation | Creates YAML-based test scenarios for AI agents. Enables deterministic testing of non-deterministic behavior and catches regressions without manual verification. |
| agent-cost-optimization | Systematic workflow for analyzing and reducing token consumption. Identifies hotspots in prompts, schemas, and tool results to minimize AI costs. |
| api-test-creation | Creates Fastify API integration tests with consistent patterns. Enforces user isolation and real database testing for reliable endpoint coverage. |
| db-test-creation | Creates Drizzle ORM repository tests using factory patterns. Ensures proper user scoping and test isolation across all database operations. |
| core-test-creation | Creates unit tests for Zod schemas and domain services. Tests validation logic in isolation without database dependencies. |
| web-test-creation | Creates E2E (Playwright) and React component tests. Enforces Page Object Model, semantic locators, and proper test isolation. |
| node-module-installer | Installs npm packages with vulnerability scanning before installation. Prevents supply chain attacks by auditing dependencies before they enter the codebase. |
| otel-tracing | Queries OpenTelemetry traces from Tempo for debugging. Analyzes agent behavior, token usage, and request flows across services. |
| brainstorming | Documents ideas for future features with full context. Systematically captures problem statements, current state, proposed solutions, and implementation steps. |
| bug-reporter | Creates detailed bug reports with root cause analysis. Preserves debugging context for structured handoffs when stepping away from a problem. |
| testing-documentation | Creates manual test plans for features requiring human judgment. Covers scenarios like MFA flows, auth, and visual verification that can't be automated. |
| ux-designer | Analyzes UX using Jobs-to-be-Done, wireframes, and value layers. Provides a systematic approach to designing functional, emotional, and social experiences. |
| code-review | Guides the process of receiving feedback and verifying work. Enforces technical rigor and evidence-based completion claims over performative responses. |
| aesthetic | Creates beautiful UI following proven design principles. Implements a four-stage approach: Beautiful → Right → Satisfying → Peak. |
| interview | Interviews developers about specifications through structured Q&A. Deep-dives into implementation details, tradeoffs, and edge cases. |

Using Skills in Practice

You: "I just finished implementing the reminder repository. 
     Use the db-test-creation skill to create tests for it."
 
Agent: [Loads skill definition with factory patterns, isolation 
       strategies, and test structure conventions]
       
       [Creates reminder.repository.test.ts with proper setup, 
       CRUD tests, user isolation tests, and edge cases]

The skill ensures the tests follow your team's conventions without you explaining them every time.


Testing Strategy: The Safety Net for AI-Generated Code

A comprehensive testing strategy is what allows you to trust AI agents to modify your code - enabling seamless refactors and new feature development.

The Testing Pyramid

                    ┌─────────────┐
                    │   E2E /     │  ← Few, slow, high confidence
                    │   Smoke     │
                    ├─────────────┤
                    │ Integration │  ← API + DB together
                    ├─────────────┤
                    │    Unit     │  ← Many, fast, isolated
                    └─────────────┘

Test Types and Their Purpose

| Test Type | What It Tests | Where It Lives | Skill |
|---|---|---|---|
| Unit Tests | Zod schemas, pure functions, domain logic | packages/core/src/**/__tests__/ | core-test-creation |
| Repository Tests | Database queries, ORM logic | packages/db/src/**/__tests__/ | db-test-creation |
| API Integration | HTTP endpoints, request/response | apps/api/src/__tests__/ | api-test-creation |
| Component Tests | React components in isolation | apps/web/src/**/__tests__/ | web-test-creation |
| E2E Tests | Full user flows in browser | apps/web/e2e/ | web-test-creation |
| Agent Scenarios | AI agent behavior (tools, responses) | apps/agent-qa/scenarios/ | agent-test-creation |
| Smoke Tests | Critical paths after deployment | Tagged scenarios | agent-test-creation |
| Manual Plans | Features requiring human judgment | docs/tests/ | testing-documentation |

Why This Matters for AI Agents

  1. Confidence in AI changes: When an agent modifies code, tests tell you if it broke anything

  2. Regression prevention: Agents can run tests before committing

  3. Contract enforcement: Integration tests ensure packages work together

  4. Refactoring safety: With good coverage, hexagonal architecture refactors become safe

  5. Documentation by example: Tests show how code is meant to be used

Testing Philosophy

  • Test at the right level: Don't E2E test what a unit test covers
  • Test real infrastructure: Use real databases in integration tests, not mocks
  • Isolate by user: Each test creates its own user/data to prevent interference
  • Make tests fast: Unit tests should run in seconds, not minutes
  • Test the contract, not the implementation: Focus on inputs/outputs
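
The "isolate by user" and "real infrastructure" principles in practice - a sketch of an API integration test using Fastify's built-in inject, with hypothetical buildApp and createTestUser helpers:

// apps/api/src/__tests__/medications.test.ts
import { it, expect } from 'vitest';
import { buildApp } from '../app';            // hypothetical app factory
import { createTestUser } from './factories'; // hypothetical test helper

it("never leaks another user's medications", async () => {
  const app = await buildApp();
  const alice = await createTestUser(app);
  const bob = await createTestUser(app);

  await app.inject({
    method: 'POST',
    url: '/medications',
    headers: { authorization: `Bearer ${alice.token}` },
    payload: { name: 'Ibuprofen', dosage: '200mg', frequency: 'daily' },
  });

  // Bob hits the same endpoint against the same real database...
  const res = await app.inject({
    method: 'GET',
    url: '/medications',
    headers: { authorization: `Bearer ${bob.token}` },
  });

  // ...and sees nothing of Alice's data.
  expect(res.statusCode).toBe(200);
  expect(res.json()).toEqual([]);

  await app.close();
});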

Agent QA: Deterministic Testing for Non-Deterministic Systems

For AI agents specifically, a YAML-based scenario framework enables deterministic testing of non-deterministic behavior:

id: test-001-task-creation
name: "Create a task via natural language"
steps:
  - chat: "Create a task called 'Buy groceries' for tomorrow"
    tools:
      manageTasks: { min: 1, max: 2 }
    created:
      - entity: tasks
        fields:
          title: { contains: "groceries" }
    response:
      mentionsAny: [created, added, scheduled]
    usage:
      inputTokens: { lt: 50000 }

You can assert on tool calls, entity mutations, response content, and token consumption—all deterministically.

For a deep dive on this approach, see Deterministic Testing for Agentic AI Systems.


The Parallel Agents Workflow

Here's where things get interesting. You can run 4-6 AI coding agents simultaneously - dramatically increasing throughput (if you orchestrate them correctly).

The Secret to Parallel Agents

Not all work is created equal:

  • Some work requires your undivided attention (complex specifications, architectural decisions)
  • Some work is independent and parallelizable (test automation for 3 separate packages)
  • Some work needs periodic check-ins (feature implementation)

The Optimal Configuration

Running multiple coding agents at once can fragment your focus through constant context switching. The way to mitigate that risk is context consolidation: group the work by how much of your attention it actually requires.

There are three distinct types of tasks you'll encounter when agentic coding with multiple agents:

  1. Tasks that require trade-off analysis and important decision-making
  2. Tasks where you need to provide minimal guidance to a coding agent
  3. Tasks that are well-defined and can be validated by the agent with zero guidance

Here's a breakdown of six different agents that you can use in parallel to maximize your throughput based on these principles.

| Agent | Role | Your Attention | Skill Used |
|---|---|---|---|
| 1 | Specification Interview | HIGH - Active dialogue | interview |
| 2 | Feature Development A | MEDIUM - Periodic review | |
| 3 | Feature Development B | MEDIUM - Periodic review | |
| 4 | Test Automation | LOW - Review when done | *-test-creation |
| 5 | Test Automation / Code Review | LOW - Review when done | code-review |
| 6 | Standby / Ideas | LOW - Fire-and-forget | brainstorming |

The Attention Distribution

┌─────────────────────────────────────────────────────────────┐
│                    YOUR ATTENTION                           │
│  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   │
│  HIGH ──────────────────────────────────────────────► LOW   │
│                                                             │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌──────────┐ │
│  │   SPEC    │  │  FEATURE  │  │   QA &    │  │ STANDBY  │ │
│  │ INTERVIEW │  │  DEV x2   │  │  TESTING  │  │  IDEAS   │ │
│  │           │  │           │  │   x2-3    │  │          │ │
│  │  (active  │  │  (round-  │  │  (review  │  │  (quick  │ │
│  │ dialogue) │  │   robin)  │  │ when done)│  │ capture) │ │
│  └───────────┘  └───────────┘  └───────────┘  └──────────┘ │
└─────────────────────────────────────────────────────────────┘

When to Parallelize

Good scenarios:

  • You've built 3 new packages that integrate correctly → Spawn 3 agents for test automation
  • You have a stable feature in review → One agent addresses feedback while another starts the next feature
  • End of session → Multiple agents documenting bugs, ideas, and handoff notes

Bad scenarios:

  • Highly interdependent changes (Agent A's work breaks Agent B's assumptions)
  • Complex architectural decisions that need singular focus
  • When you can't keep up with reviewing their output

The Round-Robin Rhythm

  1. Primary focus: Interview agent refining your specification (requires your responses)
  2. Periodic check-ins: Feature agents—review progress, answer questions, approve directions
  3. Background work: QA agents creating tests—review when complete
  4. Fire-and-forget: Standby agent captures stray ideas so you don't lose them

Real Example: Building a New Feature

Session Start:
├── Agent 1: "Interview me about the reminders feature spec" (interview skill)
├── Agent 2: "Implement the tasks schema in core package"
├── Agent 3: "Implement the tasks repository in db package"
├── Agent 4-5: On standby
 
After Agents 2-3 complete:
├── Agent 1: Still refining spec with you
├── Agent 2: "Now implement the reminder schema in core package"
├── Agent 3: "Now implement the API routes for tasks"
├── Agent 4: "Create db tests for tasks repository"
├── Agent 5: "Create core tests for tasks schema"
 
Mid-session idea:
├── Agent 6: "Use brainstorming skill to document idea for snoozing"
        → Creates IDEA-007-reminder-snoozing.md
        → Returns to standby

Key Principles

  1. One specification agent gets your primary focus: This is where architectural decisions happen
  2. No more than 2 features in parallel: More than that and you lose coherence
  3. QA agents work independently: They reference the code, not your attention
  4. Always have a standby agent: For capturing ideas without breaking flow
  5. Dependencies dictate serialization: If Agent B needs Agent A's output, don't parallelize

Observability and CLI Tools

Giving AI coding agents access to observability data and custom tooling amplifies what they can accomplish by closing the end-to-end testing loop. Several tools have become essential to this workflow.

agent-qa: Testing AI Agents

Simulates multi-turn conversations with AI agents and makes test assertions against them. YAML scenarios define expected behavior:

# Run a specific scenario
agentqa run scenarios/tasks/suite.yaml --id test-001
 
# Run by tag
agentqa run scenarios/smoke/suite.yaml --tag critical
 
# Save diagnostics for debugging
agentqa run scenarios/suite.yaml --id test-001 --save-diagnostics

Diagnostics include HTTP responses, token usage breakdowns, and OpenTelemetry traces.

traces: Querying OpenTelemetry

When something goes wrong, you need to understand what the AI agent actually did:

# Find traces for a conversation
pnpm traces search --correlation conv_abc123 --fetch
 
# Get recent traces from a service
pnpm traces recent --service pocketcoach-api --since 1h
 
# Get full trace details
pnpm traces get <trace-id>

Token Analysis

Understanding token consumption is critical for cost management:

# Count tokens in a prompt
agentqa tokens "Your system prompt text here"
 
# Analyze Zod schema token costs
agentqa schema-tokens ./src/agents/tasks/types.ts --sort tokens

Output shows which schemas are consuming the most context:

Schema Token Analysis (claude-haiku-4-5)
────────────────────────────────────────────────────────────
Schema Name                         │   Tokens │       Size
────────────────────────────────────────────────────────────
TaskManageSchema                    │      847 │     3.2 KB
TaskInputSchema                     │      523 │     2.1 KB
────────────────────────────────────────────────────────────
Total                               │    1,370 │     5.3 KB

What to Trace

Observability should be built into every layer of the stack using OpenTelemetry trace spans, events and metrics.

  • HTTP requests: Every API call with timing and status
  • Agent execution: Which agent, which tools, which skills
  • LLM calls: Model, tokens, cost, cache hits
  • Tool invocations: Arguments and results
  • Database operations: Queries and timing
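
As one example of instrumenting the tool layer, here's a sketch using the public @opentelemetry/api package (the helper and attribute names are mine):

// packages/observability/src/tool-span.ts - illustrative helper
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('product-a-ai');

// Wraps a tool invocation in a span so arguments, timing, and
// failures show up in Tempo alongside the rest of the request.
export async function traceTool<T>(
  name: string,
  args: unknown,
  fn: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(`tool.${name}`, async (span) => {
    span.setAttribute('tool.args', JSON.stringify(args));
    try {
      const result = await fn();
      span.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}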

Conclusion: Agentic Engineering > Vibe Coding

Let's be clear about the difference.

Vibe coding is letting AI write code while you approve outputs without deep understanding. It's fast, it ships, and it accumulates technical debt with every commit.

Agentic coding is using AI as a force multiplier while maintaining a deep understanding of your systems, architecture and the long-term implications of every decision.

Why Upfront Investment Matters

The temptation is real: ship fast, fix later. But "later" becomes a mountain of:

  • Duplicate code scattered across packages
  • Tightly coupled systems that break when you touch them
  • Technical debt that compounds with interest
  • Major refactors that consume months instead of days

The alternative: Take time in the beginning to:

  • Create shared packages with clear boundaries
  • Build excellent tooling (CLI tools, testing frameworks)
  • Close the observability loop (traces, metrics, logs)
  • Establish persistent memory (docs folders, CLAUDE.md files)
  • Define decoupled architecture (hexagonal, ports & adapters)

When you invest upfront in architecture and tooling:

  1. AI agents operate more effectively: Decoupled code means smaller, focused files. AI understands them better.

  2. Future changes become trivial:

    • Want to swap your agent framework? Only the ai/ package changes.
    • Want to switch databases? Only db/ changes; core/ stays intact.
    • Want to add a new frontend? It consumes the same core/ services.
  3. Testing provides a safety net: With unit, integration, E2E, smoke, and contract tests, refactoring becomes safe. AI can make changes confidently.

  4. Technical debt doesn't accumulate: Good architecture prevents it. Good testing catches it. Good documentation explains why decisions were made.

  5. Your understanding deepens, not atrophies: You're not just approving AI output - you're guiding it with grounded expertise.

The Goal

A well-engineered monorepo isn't just about today's velocity. It's about:

  • Avoiding major refactors in the future
  • Keeping code DRY (Don't Repeat Yourself)
  • Preventing technical debt from poor early decisions
  • Enabling AI agents to work at their full potential

The process may feel tedious in the beginning. But it's infinitely less tedious than:

  • Rewriting your database layer because it's coupled to your business logic
  • Debugging AI-generated code with no tests and no traces
  • Explaining to your future self why this architecture made sense

Final Thoughts

The future of software development is AI-augmented - there's no arguing that anymore. Augmentation, however, only works when there's something solid to augment!

So build a solid foundation:

  • Architecture: Hexagonal, decoupled, testable
  • Documentation: CLAUDE.md files, docs folders, ADRs
  • Testing: Unit, integration, E2E, agent scenarios
  • Tooling: Custom CLIs, observability, token analysis
  • Workflow: Parallel agents with clear roles

Then let AI multiply your impact - not your technical debt. And most importantly - never lose touch with how things work under the hood. If you succumb to the full vibe coding experience while building anything of moderate complexity, you will 100% find yourself one day lost in a spaghetti-code nightmare.

Perhaps this will change one day, but I'm still 100% convinced that engineers will always be in the loop. Engineers will not disappear because of AI. Engineers will instead become 10-100x more effective than they are today.

I firmly believe, though, that this is all contingent on investing in a solid foundation, a sound workflow, and a systems-thinking mindset.


This is the second article in a series on building AI-ready codebases. The first, Deterministic Testing for Agentic AI Systems, covers testing AI agents with behavioral assertions instead of LLM evaluations.