The Ultimate Guide to Agentic Coding

Ted Werbel
AI · Architecture · Developer Experience · Monorepo · Testing · Claude Code

Vibe coding produces technical debt at scale and disconnects developers from how their systems actually work. Learn how to structure a monorepo with hexagonal architecture, documentation layers, comprehensive testing, and custom tooling to make AI coding agents genuinely effective.

"Just tell it what you want and it'll build it." That's the promise of vibe coding. Express your intent in natural language, let the AI figure out the rest. Ship fast, iterate later.

Here's what actually happens: poorly structured code, tightly coupled systems, no tests, no documentation, and a mountain of technical debt that compounds with every commit. Six months later, you're staring at a codebase that neither you nor the AI can reason about.

Vibe coding fails because it substitutes intent for understanding. People who don't understand engineering express what they want without expressing how the system should work. The AI fills in the gaps with its best guesses—and those guesses accumulate into architectural chaos.

This article is for two audiences:

  1. Vibe coders who want to level up and actually understand what they're building
  2. Engineering organizations struggling to prepare their codebases for AI coding agents

The thesis is simple: Agentic coding will always beat vibe coding. Understanding your systems, designing clean architecture, and investing in documentation and testing isn't just good practice - it's what makes AI agents actually effective.

Here's how to structure a monorepo to make AI coding agents work at their full potential.


The Monorepo Advantage

The first architectural decision that pays dividends: consolidate your code into a single repository.

When AI agents work with your codebase, they need context. With separate repositories, they're constantly jumping between repos, losing track of shared patterns, and making changes that break integrations they can't see. With a monorepo, everything is visible.

Example Structure

monorepo/
├── products/                    # Product-specific code
│   ├── product-a/
│   │   ├── apps/                # Web, API, Lambda, CLI
│   │   ├── packages/            # Core, DB, AI (hexagonal)
│   │   └── docs/                # Product-specific documentation
│   └── product-b/
│       └── ...

├── packages/                    # Shared utilities (cross-product)
│   ├── observability/           # OpenTelemetry instrumentation
│   ├── rate-limit/              # Upstash Redis rate limiting
│   ├── cost-registry/           # AI model cost tracking
│   └── ...

├── apps/                        # Internal tooling
│   ├── team-docs/               # Fumadocs documentation site
│   └── agent-tracer/            # Trace visualization

└── internal-docs/               # Monorepo-wide documentation
    ├── adr/                     # Architecture Decision Records
    ├── ideas/                   # Cross-product brainstorming
    └── bugs/                    # Cross-product bug reports
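
If the monorepo runs on pnpm workspaces (the commands later in this article assume pnpm), the root workspace file maps directly onto this layout. A sketch:

# pnpm-workspace.yaml - hypothetical, matching the structure above
packages:
  - "products/*/apps/*"
  - "products/*/packages/*"
  - "packages/*"
  - "apps/*"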

Products vs. Apps: A Subtle Distinction

Some teams prefer a flat apps/ directory without the products/ layer. That works fine for single-product companies. But by creating a products/ directory, you ensure that any new product your organization creates is self-contained:

  • Its own docs/ directory
  • Its own decoupled packages (core/, db/, ai/)
  • Its own infrastructure configuration
  • Its own test suites

When Product B launches, it doesn't pollute Product A. When an AI agent works on Product A, it has clear boundaries.

The Internal Apps Directory

Notice the apps/ folder at the monorepo root. This is for internal tooling that serves the entire organization:

  • team-docs: A Fumadocs-powered documentation site with human-readable docs. While CLAUDE.md files are for AI agents, team docs are for humans. Every product gets its own section. Shared utilities get their own section. As changes are made, team docs are updated.

  • agent-tracer: A visualization tool for OpenTelemetry traces, helping debug AI agent and system behavior. This one is entirely optional - I built it purely as a nicer-looking alternative to Grafana that makes it easier to filter spans, visualize metrics, and quickly optimize agents, all on top of Loki, Tempo and Mimir.

These internal apps don't ship to customers - their purpose is to make the team more effective.

Monorepo-Wide Documentation

Just as each product has a docs/ folder, the monorepo root has internal-docs/ with the same structure:

internal-docs/
├── adr/        # Architecture decisions affecting multiple products
├── ideas/      # Ideas for shared packages or cross-product features
├── bugs/       # Bugs in shared infrastructure
├── plans/      # Migration plans affecting the whole monorepo
└── specs/      # Specifications for shared utilities

This mirrors the product-level structure, so AI agents know exactly where to find and store information at any level.


Hexagonal Architecture

Separation of concerns isn't just good engineering - it's essential for AI comprehension. When your business logic, database access, and API layer are tangled together, AI agents produce more tangled spaghetti code. When they're cleanly separated, agents can modify one layer without breaking the others.

The Pattern

                    ┌─────────────────┐
                    │   Application   │
                    │   (Use Cases)   │
                    └────────┬────────┘

         ┌───────────────────┼───────────────────┐
         │                   │                   │
    ┌────▼────┐        ┌─────▼─────┐       ┌─────▼─────┐
    │  Ports  │        │  Domain   │       │  Ports    │
    │  (In)   │        │  (Core)   │       │  (Out)    │
    └────┬────┘        └───────────┘       └─────┬─────┘
         │                                       │
    ┌────▼────┐                            ┌─────▼─────┐
    │Adapters │                            │ Adapters  │
    │ (API)   │                            │ (DB, LLM) │
    └─────────┘                            └───────────┘

Key Principles:

  • Domain (core/): Pure business logic, Zod schemas, no external dependencies
  • Ports: Interfaces defining what the domain needs (repositories, providers)
  • Adapters: Implementations (Drizzle repos, Anthropic provider, Fastify routes)
  • Dependency Injection: Services receive ports, not concrete implementations

Package Structure

Every product follows the same pattern:

products/<product>/packages/
├── core/     # Domain logic, types, schemas (database-agnostic)
├── db/       # Database schema, repository implementations
└── ai/       # AI agents, tools, prompts (if applicable)

Concrete Example: Medication Manager

Here's how a simple medication management app might be structured:

packages/core/src/domains/medications/

// entity.ts - Pure TypeScript types
export interface Medication {
  id: string;
  userId: string;
  name: string;
  dosage: string;
  frequency: 'daily' | 'weekly';
  isActive: boolean;
  createdAt: Date;
}
 
// schema.ts - Zod validation (source of truth)
import { z } from 'zod';
export const createMedicationInputSchema = z.object({
  name: z.string().min(1).max(100),
  dosage: z.string().min(1).max(50),
  frequency: z.enum(['daily', 'weekly']),
});
 
export type CreateMedicationInput = z.infer<typeof createMedicationInputSchema>;
 
// repository.ts - Port interface (no implementation)
export interface MedicationRepository {
  findById(userId: string, id: string): Promise<Medication | null>;
  create(userId: string, input: CreateMedicationInput): Promise<Medication>;
  list(userId: string): Promise<Medication[]>;
  deactivate(userId: string, id: string): Promise<void>;
}
 
// service.ts - Business logic using the port
import { NotFoundError } from '../../errors'; // path illustrative - any shared error class works
export function createMedicationService(repo: MedicationRepository) {
  return {
    async create(userId: string, input: CreateMedicationInput) {
      const validated = createMedicationInputSchema.parse(input);
      return repo.create(userId, validated);
    },
    
    async deactivate(userId: string, id: string) {
      const medication = await repo.findById(userId, id);
      if (!medication) throw new NotFoundError('Medication not found');
      await repo.deactivate(userId, id);
    },
  };
}

packages/db/src/repositories/medication.repository.ts

// Adapter implementing the port
import { and, eq } from 'drizzle-orm';
import { medications } from '../schema';   // Drizzle table definition (path illustrative)
import type { Database } from '../client'; // your Drizzle instance type (path illustrative)
import type {
  Medication,
  CreateMedicationInput,
  MedicationRepository,
} from '@product-a/core'; // package name illustrative

export class DrizzleMedicationRepository implements MedicationRepository {
  constructor(private db: Database) {}
 
  async findById(userId: string, id: string): Promise<Medication | null> {
    const [result] = await this.db
      .select()
      .from(medications)
      .where(and(eq(medications.userId, userId), eq(medications.id, id)));
    return result ?? null;
  }
 
  async create(userId: string, input: CreateMedicationInput): Promise<Medication> {
    const [result] = await this.db
      .insert(medications)
      .values({ userId, ...input })
      .returning();
    return result;
  }
  
  // ... other methods
}
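
Tying the two packages together, a composition root near the app's entry point injects the adapter into the service. A minimal sketch (the package names and file location are illustrative):

// apps/api/src/composition.ts
import { createMedicationService } from '@product-a/core';
import { DrizzleMedicationRepository } from '@product-a/db';
import { db } from './db';

// Only this file knows the concrete adapter; route handlers and
// business logic see nothing but the MedicationRepository port.
const medicationRepo = new DrizzleMedicationRepository(db);
export const medicationService = createMedicationService(medicationRepo);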

Why This Matters for AI Agents

With hexagonal architecture:

  1. Clear boundaries = smaller, focused files: AI agents understand smaller contexts better

  2. Database logic is isolated: An agent can modify queries in db/ without touching business rules in core/

  3. Swappable implementations: Want to switch from Postgres to MongoDB? Only db/ changes. Want to swap your AI framework? Only ai/ changes. The core/ package doesn't care.

  4. Testing is straightforward: Mock the repository interface in core/ tests; use a real database in db/ tests (sketched below)

  5. Future-proofing: Need different datastores for different capabilities (SQL for transactions, vector DB for search)? Hexagonal makes this trivial.

The architecture absorbs change so your business logic doesn't have to.
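
To make point 4 concrete, here's a minimal core/ test sketch, assuming Vitest (any runner with mocks works the same way):

// packages/core/src/domains/medications/__tests__/service.test.ts
import { it, expect, vi } from 'vitest';
import { createMedicationService } from '../service';
import type { MedicationRepository } from '../repository';

it('refuses to deactivate a medication that does not exist', async () => {
  // The port is mocked - no database, no network, just the business rule.
  const repo: MedicationRepository = {
    findById: vi.fn().mockResolvedValue(null),
    create: vi.fn(),
    list: vi.fn(),
    deactivate: vi.fn(),
  };
  const service = createMedicationService(repo);

  await expect(service.deactivate('user-1', 'med-404'))
    .rejects.toThrow('Medication not found');
  expect(repo.deactivate).not.toHaveBeenCalled();
});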


Grounding AI with CLAUDE.md Files

AI coding agents need contextual documentation at every level of the codebase. CLAUDE.md files serve as the primary grounding mechanism—structured documentation that tells agents what they need to know about each area of the code.

The Hierarchy

monorepo/
├── CLAUDE.md                           # Monorepo overview, architecture, commands
├── products/
│   └── product-a/
│       ├── CLAUDE.md                   # Product overview, how packages connect
│       ├── packages/
│       │   ├── core/
│       │   │   └── CLAUDE.md           # Domain logic patterns, Zod conventions
│       │   ├── db/
│       │   │   └── CLAUDE.md           # Drizzle patterns, migration rules
│       │   └── ai/
│       │       └── CLAUDE.md           # Agent patterns, tool conventions
│       └── apps/
│           └── api/
│               └── CLAUDE.md           # Route patterns, auth, error handling
└── packages/
    └── observability/
        └── CLAUDE.md                   # Instrumentation patterns

What to Include

Monorepo Root CLAUDE.md:

  • Project overview and architecture
  • Common commands (pnpm install, pnpm build:packages)
  • Monorepo structure explanation
  • Links to key patterns and conventions
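
A trimmed sketch of what a root CLAUDE.md might contain (contents illustrative):

# Monorepo Guide

pnpm monorepo. Products live in products/, shared utilities in packages/,
internal tooling in apps/.

## Common Commands

- pnpm install          # install all workspace dependencies
- pnpm build:packages   # build shared and product packages
- pnpm test             # run every test suite

## Conventions

- Hexagonal architecture: core/ packages have no external dependencies
- Zod schemas are the source of truth for validation
- Read products/<product>/CLAUDE.md before touching a product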

Package-Level CLAUDE.md:

  • Package purpose and responsibilities
  • Key patterns used in this package
  • Testing commands specific to this package
  • "Important Rules" (things the AI must never do)

Example: Important Rules

## Important Rules
 
- **Never write raw SQL migrations**. Always use `pnpm db:generate` to create
  migrations from the Drizzle schema. Hand-written SQL will corrupt the 
  migration history.
 
- **Medications cannot be deleted**. They can only be deactivated via the
  `deactivate()` method. Historical dose data must be preserved.
 
- **All repository methods must scope by userId**. Never query without
  filtering by the requesting user's ID.

These rules prevent AI agents from making catastrophic mistakes. They're the guardrails that let you trust the AI to make changes.


Docs Folders: Long-Term Memory for AI Agents

Beyond CLAUDE.md files, every product (and the monorepo itself) should have a dedicated docs/ folder - a form of long-term memory for your coding agent. This also gives your team visibility into key decisions, specifications, bugs and other context around the AI's thinking process.

The Folder Structure

products/<product>/docs/
├── adr/       # Architecture Decision Records
├── analysis/  # Cost projections, token analysis, research
├── bugs/      # Bug reports with root cause analysis
├── design/    # UX analysis, wireframes, design docs
├── ideas/     # Future features, brainstorms, explorations
├── plans/     # Implementation plans, migration strategies
├── prd/       # Product Requirements Documents
├── specs/     # Technical specifications
└── tests/     # Manual test plans for human verification

The monorepo's internal-docs/ folder mirrors this structure for cross-product concerns.
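
ADRs can follow the familiar context/decision/consequences shape. A hypothetical entry:

# ADR-0007: Isolate persistence behind repository ports

Status: Accepted

## Context

core/ must stay database-agnostic so each capability can pick the
right datastore later.

## Decision

All data access goes through repository interfaces defined in core/.
The db/ package provides the Drizzle implementations.

## Consequences

Swapping datastores touches only db/. core/ tests mock the ports;
db/ tests run against a real database.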

Why Separate Product vs. Monorepo Docs?

Product docs (products/product-a/docs/):

  • Ideas specific to that product
  • Bugs in that product's code
  • Specs for that product's features
  • Design docs for that product's UI

Monorepo docs (internal-docs/):

  • Ideas for shared packages
  • Bugs in shared infrastructure
  • Architecture decisions affecting multiple products
  • Migration plans that touch everything

When an AI agent uses the brainstorming skill, it knows whether to save to product docs or internal docs based on scope.

Mapping Folders to Skills

Each folder has a corresponding skill that knows how to create properly formatted documents:

| Folder    | Skill                   | When It's Used                                |
|-----------|-------------------------|-----------------------------------------------|
| ideas/    | brainstorming           | Capture ideas without implementing            |
| bugs/     | bug-reporter            | Document bugs when stuck or ending a session  |
| tests/    | testing-documentation   | Manual test plans for human verification      |
| design/   | ux-designer             | UX analysis, wireframes, JTBD                 |
| prd/      | ux-designer             | Product requirements documents                |
| specs/    | interview               | Technical specs refined through Q&A           |
| analysis/ | agent-cost-optimization | Token analysis, cost projections              |

Why This Matters

  1. Extends your cognitive range: Mid-coding and have an idea? Tell the agent: "Use the brainstorming skill to document this idea about voice input." It captures the full context—problem statement, current state analysis, proposed solutions—and you return to focus without losing the thought.

  2. Reduces friction: No mental overhead of "where do I put this?" The structure + skills handle it.

  3. Team visibility: Your team can review what the AI (and you) have been thinking about. Ideas, bugs, specs—all in version control.

  4. Persistent memory: AI agents can reference past decisions, bugs, and ideas. The docs/ folder becomes their long-term memory.


Skills: Teaching AI How to Do Things Right

In traditional organizations, Standard Operating Procedures (SOPs) encode organization-wide knowledge. "Here's how we onboard a customer." "Here's how we handle a security incident." SOPs ensure consistency and quality regardless of who does the work.

Skills are SOPs for AI agents.

They're reusable prompt templates that encode domain expertise and enforce consistency. Instead of explaining the same patterns every time, you teach the agent once, then invoke the skill by name.

The Power of Skills

Without skills, your CLAUDE.md files and global rules would be enormous—trying to cover every scenario. With skills, you can keep your global rules minimal:

## Skills Reference
 
- **api-test-creation**: Creates Fastify API integration tests
- **db-test-creation**: Creates Drizzle ORM repository tests
- **brainstorming**: Documents ideas for future features
- **bug-reporter**: Creates detailed bug reports

A two-sentence description is enough. When you say "use the api-test-creation skill," the agent loads the full skill definition with all its patterns, examples, and conventions.
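
In Claude Code, for instance, a skill is a markdown file whose frontmatter carries that short description; the body holds the full procedure. A condensed, hypothetical example:

.claude/skills/api-test-creation/SKILL.md

---
name: api-test-creation
description: Creates Fastify API integration tests with consistent patterns.
---

When creating API integration tests:

1. Test against the real test database - never mock the repository layer.
2. Create a fresh user per test and assert other users' data stays invisible.
3. Cover the happy path, validation failures (400), and missing auth (401).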

Example Skills Library

| Skill | Description |
|---|---|
| agent-test-creation | Creates YAML-based test scenarios for AI agents. Enables deterministic testing of non-deterministic behavior and catches regressions without manual verification. |
| agent-cost-optimization | Systematic workflow for analyzing and reducing token consumption. Identifies hotspots in prompts, schemas, and tool results to minimize AI costs. |
| api-test-creation | Creates Fastify API integration tests with consistent patterns. Enforces user isolation and real database testing for reliable endpoint coverage. |
| db-test-creation | Creates Drizzle ORM repository tests using factory patterns. Ensures proper user scoping and test isolation across all database operations. |
| core-test-creation | Creates unit tests for Zod schemas and domain services. Tests validation logic in isolation without database dependencies. |
| web-test-creation | Creates E2E (Playwright) and React component tests. Enforces Page Object Model, semantic locators, and proper test isolation. |
| node-module-installer | Installs npm packages with vulnerability scanning before installation. Prevents supply chain attacks by auditing dependencies before they enter the codebase. |
| otel-tracing | Queries OpenTelemetry traces from Tempo for debugging. Analyzes agent behavior, token usage, and request flows across services. |
| brainstorming | Documents ideas for future features with full context. Systematically captures problem statements, current state, proposed solutions, and implementation steps. |
| bug-reporter | Creates detailed bug reports with root cause analysis. Preserves debugging context for structured handoffs when stepping away from a problem. |
| testing-documentation | Creates manual test plans for features requiring human judgment. Covers scenarios like MFA flows, auth, and visual verification that can't be automated. |
| ux-designer | Analyzes UX using Jobs-to-be-Done, wireframes, and value layers. Provides a systematic approach to designing functional, emotional, and social experiences. |
| code-review | Guides the process of receiving feedback and verifying work. Enforces technical rigor and evidence-based completion claims over performative responses. |
| aesthetic | Creates beautiful UI following proven design principles. Implements a four-stage approach: Beautiful → Right → Satisfying → Peak. |
| interview | Interviews developers about specifications through structured Q&A. Deep-dives into implementation details, tradeoffs, and edge cases. |

Using Skills in Practice

You: "I just finished implementing the reminder repository. 
     Use the db-test-creation skill to create tests for it."
 
Agent: [Loads skill definition with factory patterns, isolation 
       strategies, and test structure conventions]
       
       [Creates reminder.repository.test.ts with proper setup, 
       CRUD tests, user isolation tests, and edge cases]

The skill ensures the tests follow your team's conventions without you explaining them every time.


Testing Strategy: The Safety Net for AI-Generated Code

A comprehensive testing strategy is what allows you to trust AI agents to modify your code - enabling seamless refactors and new feature development.

The Testing Pyramid

                    ┌─────────────┐
                    │   E2E /     │  ← Few, slow, high confidence
                    │   Smoke     │
                    ├─────────────┤
                    │ Integration │  ← API + DB together
                    ├─────────────┤
                    │    Unit     │  ← Many, fast, isolated
                    └─────────────┘

Test Types and Their Purpose

| Test Type | What It Tests | Where It Lives | Skill |
|---|---|---|---|
| Unit Tests | Zod schemas, pure functions, domain logic | packages/core/src/**/__tests__/ | core-test-creation |
| Repository Tests | Database queries, ORM logic | packages/db/src/**/__tests__/ | db-test-creation |
| API Integration | HTTP endpoints, request/response | apps/api/src/__tests__/ | api-test-creation |
| Component Tests | React components in isolation | apps/web/src/**/__tests__/ | web-test-creation |
| E2E Tests | Full user flows in browser | apps/web/e2e/ | web-test-creation |
| Agent Scenarios | AI agent behavior (tools, responses) | apps/agent-qa/scenarios/ | agent-test-creation |
| Smoke Tests | Critical paths after deployment | Tagged scenarios | agent-test-creation |
| Manual Plans | Features requiring human judgment | docs/tests/ | testing-documentation |

Why This Matters for AI Agents

  1. Confidence in AI changes: When an agent modifies code, tests tell you if it broke anything

  2. Regression prevention: Agents can run tests before committing

  3. Contract enforcement: Integration tests ensure packages work together

  4. Refactoring safety: With good coverage, hexagonal architecture refactors become safe

  5. Documentation by example: Tests show how code is meant to be used

Testing Philosophy

  • Test at the right level: Don't E2E test what a unit test covers
  • Test real infrastructure: Use real databases in integration tests, not mocks
  • Isolate by user: Each test creates its own user/data to prevent interference
  • Make tests fast: Unit tests should run in seconds, not minutes
  • Test the contract, not the implementation: Focus on inputs/outputs
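
The "isolate by user" and "real infrastructure" principles in practice - a sketch of an API integration test using Fastify's built-in inject, with hypothetical buildApp and createTestUser helpers:

// apps/api/src/__tests__/medications.test.ts
import { it, expect } from 'vitest';
import { buildApp } from '../app';            // hypothetical app factory
import { createTestUser } from './factories'; // hypothetical test helper

it("never leaks another user's medications", async () => {
  const app = await buildApp();
  const alice = await createTestUser(app);
  const bob = await createTestUser(app);

  await app.inject({
    method: 'POST',
    url: '/medications',
    headers: { authorization: `Bearer ${alice.token}` },
    payload: { name: 'Ibuprofen', dosage: '200mg', frequency: 'daily' },
  });

  // Bob hits the same endpoint against the same real database...
  const res = await app.inject({
    method: 'GET',
    url: '/medications',
    headers: { authorization: `Bearer ${bob.token}` },
  });

  // ...and sees nothing of Alice's data.
  expect(res.statusCode).toBe(200);
  expect(res.json()).toEqual([]);

  await app.close();
});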

Agent QA: Deterministic Testing for Non-Deterministic Systems

For AI agents specifically, a YAML-based scenario framework enables deterministic testing of non-deterministic behavior:

id: test-001-task-creation
name: "Create a task via natural language"
steps:
  - chat: "Create a task called 'Buy groceries' for tomorrow"
    tools:
      manageTasks: { min: 1, max: 2 }
    created:
      - entity: tasks
        fields:
          title: { contains: "groceries" }
    response:
      mentionsAny: [created, added, scheduled]
    usage:
      inputTokens: { lt: 50000 }

You can assert on tool calls, entity mutations, response content, and token consumption—all deterministically.

For a deep dive on this approach, see Deterministic Testing for Agentic AI Systems.


The Parallel Agents Workflow

Here's where things get interesting. You can run 4-6 AI coding agents simultaneously - dramatically increasing throughput (if you orchestrate them correctly).

The Secret to Parallel Agents

Not all work is created equal:

  • Some work requires your undivided attention (complex specifications, architectural decisions)
  • Some work is independent and parallelizable (test automation for 3 separate packages)
  • Some work needs periodic check-ins (feature implementation)

The Optimal Configuration

Running multiple coding agents at once can fragment your focus through constant context switching. The way to mitigate that risk is context consolidation: group the work by how much of your attention it actually requires.

There are three distinct types of tasks you'll encounter when agentic coding with multiple agents:

  1. Tasks that require trade-off analysis and important decision-making
  2. Tasks where you need to provide minimal guidance to a coding agent
  3. Tasks that are well-defined and can be validated by the agent with zero guidance

Here's a breakdown of six different agents that you can use in parallel to maximize your throughput based on these principles.

| Agent | Role | Your Attention | Skill Used |
|---|---|---|---|
| 1 | Specification Interview | HIGH - Active dialogue | interview |
| 2 | Feature Development A | MEDIUM - Periodic review | |
| 3 | Feature Development B | MEDIUM - Periodic review | |
| 4 | Test Automation | LOW - Review when done | *-test-creation |
| 5 | Test Automation / Code Review | LOW - Review when done | code-review |
| 6 | Standby / Ideas | LOW - Fire-and-forget | brainstorming |

The Attention Distribution

┌─────────────────────────────────────────────────────────────┐
│                    YOUR ATTENTION                           │
│  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   │
│  HIGH ──────────────────────────────────────────────► LOW   │
│                                                             │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌──────────┐ │
│  │   SPEC    │  │  FEATURE  │  │   QA &    │  │ STANDBY  │ │
│  │ INTERVIEW │  │  DEV x2   │  │  TESTING  │  │  IDEAS   │ │
│  │           │  │           │  │   x2-3    │  │          │ │
│  │  (active  │  │  (round-  │  │  (review  │  │  (quick  │ │
│  │ dialogue) │  │   robin)  │  │ when done)│  │ capture) │ │
│  └───────────┘  └───────────┘  └───────────┘  └──────────┘ │
└─────────────────────────────────────────────────────────────┘

When to Parallelize

Good scenarios:

  • You've built 3 new packages that integrate correctly → Spawn 3 agents for test automation
  • You have a stable feature in review → One agent addresses feedback while another starts the next feature
  • End of session → Multiple agents documenting bugs, ideas, and handoff notes

Bad scenarios:

  • Highly interdependent changes (Agent A's work breaks Agent B's assumptions)
  • Complex architectural decisions that need singular focus
  • When you can't keep up with reviewing their output

The Round-Robin Rhythm

  1. Primary focus: Interview agent refining your specification (requires your responses)
  2. Periodic check-ins: Feature agents—review progress, answer questions, approve directions
  3. Background work: QA agents creating tests—review when complete
  4. Fire-and-forget: Standby agent captures stray ideas so you don't lose them

Real Example: Building a New Feature

Session Start:
├── Agent 1: "Interview me about the reminders feature spec" (interview skill)
├── Agent 2: "Implement the tasks schema in core package"
├── Agent 3: "Implement the tasks repository in db package"
├── Agent 4-5: On standby
 
After Agents 2-3 complete:
├── Agent 1: Still refining spec with you
├── Agent 2: "Now implement the reminder schema in core package"
├── Agent 3: "Now implement the API routes for tasks"
├── Agent 4: "Create db tests for tasks repository"
├── Agent 5: "Create core tests for tasks schema"
 
Mid-session idea:
├── Agent 6: "Use brainstorming skill to document idea for snoozing"
        → Creates IDEA-007-reminder-snoozing.md
        → Returns to standby

Key Principles

  1. One specification agent gets your primary focus: This is where architectural decisions happen
  2. No more than 2 features in parallel: More than that and you lose coherence
  3. QA agents work independently: They reference the code, not your attention
  4. Always have a standby agent: For capturing ideas without breaking flow
  5. Dependencies dictate serialization: If Agent B needs Agent A's output, don't parallelize

Observability and CLI Tools

Giving AI coding agents access to observability data and custom tooling amplifies what they can accomplish by closing the end-to-end testing loop. Several tools have become essential to this workflow.

agent-qa: Testing AI Agents

Simulates multi-turn conversations with AI agents and makes test assertions against them. YAML scenarios define expected behavior:

# Run a specific scenario
agentqa run scenarios/tasks/suite.yaml --id test-001
 
# Run by tag
agentqa run scenarios/smoke/suite.yaml --tag critical
 
# Save diagnostics for debugging
agentqa run scenarios/suite.yaml --id test-001 --save-diagnostics

Diagnostics include HTTP responses, token usage breakdowns, and OpenTelemetry traces.

traces: Querying OpenTelemetry

When something goes wrong, you need to understand what the AI agent actually did:

# Find traces for a conversation
pnpm traces search --correlation conv_abc123 --fetch
 
# Get recent traces from a service
pnpm traces recent --service pocketcoach-api --since 1h
 
# Get full trace details
pnpm traces get <trace-id>

Token Analysis

Understanding token consumption is critical for cost management:

# Count tokens in a prompt
agentqa tokens "Your system prompt text here"
 
# Analyze Zod schema token costs
agentqa schema-tokens ./src/agents/tasks/types.ts --sort tokens

Output shows which schemas are consuming the most context:

Schema Token Analysis (claude-haiku-4-5)
────────────────────────────────────────────────────────────
Schema Name                         │   Tokens │       Size
────────────────────────────────────────────────────────────
TaskManageSchema                    │      847 │     3.2 KB
TaskInputSchema                     │      523 │     2.1 KB
────────────────────────────────────────────────────────────
Total                               │    1,370 │     5.3 KB

What to Trace

Observability should be built into every layer of the stack using OpenTelemetry trace spans, events and metrics.

  • HTTP requests: Every API call with timing and status
  • Agent execution: Which agent, which tools, which skills
  • LLM calls: Model, tokens, cost, cache hits
  • Tool invocations: Arguments and results
  • Database operations: Queries and timing
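
As one example of instrumenting the tool layer, here's a sketch using the public @opentelemetry/api package (the helper and attribute names are mine):

// packages/observability/src/tool-span.ts - illustrative helper
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('product-a-ai');

// Wraps a tool invocation in a span so arguments, timing, and
// failures show up in Tempo alongside the rest of the request.
export async function traceTool<T>(
  name: string,
  args: unknown,
  fn: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(`tool.${name}`, async (span) => {
    span.setAttribute('tool.args', JSON.stringify(args));
    try {
      const result = await fn();
      span.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}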

Conclusion: Agentic Engineering > Vibe Coding

Let's be clear about the difference.

Vibe coding is letting AI write code while you approve outputs without deep understanding. It's fast, it ships, and it accumulates technical debt with every commit.

Agentic coding is using AI as a force multiplier while maintaining a deep understanding of your systems, architecture and the long-term implications of every decision.

Why Upfront Investment Matters

The temptation is real: ship fast, fix later. But "later" becomes a mountain of:

  • Duplicate code scattered across packages
  • Tightly coupled systems that break when you touch them
  • Technical debt that compounds with interest
  • Major refactors that consume months instead of days

The alternative: Take time in the beginning to:

  • Create shared packages with clear boundaries
  • Build excellent tooling (CLI tools, testing frameworks)
  • Close the observability loop (traces, metrics, logs)
  • Establish persistent memory (docs folders, CLAUDE.md files)
  • Define decoupled architecture (hexagonal, ports & adapters)

When you invest upfront in architecture and tooling:

  1. AI agents operate more effectively: Decoupled code means smaller, focused files. AI understands them better.

  2. Future changes become trivial:

    • Want to swap your agent framework? Only the ai/ package changes.
    • Want to switch databases? Only db/ changes; core/ stays intact.
    • Want to add a new frontend? It consumes the same core/ services.
  3. Testing provides a safety net: With unit, integration, E2E, smoke, and contract tests, refactoring becomes safe. AI can make changes confidently.

  4. Technical debt doesn't accumulate: Good architecture prevents it. Good testing catches it. Good documentation explains why decisions were made.

  5. Your understanding deepens, not atrophies: You're not just approving AI output - you're guiding it with grounded expertise.

The Goal

A well-engineered monorepo isn't just about today's velocity. It's about:

  • Avoiding major refactors in the future
  • Keeping code DRY (Don't Repeat Yourself)
  • Preventing technical debt from poor early decisions
  • Enabling AI agents to work at their full potential

The process may feel tedious in the beginning. But it's infinitely less tedious than:

  • Rewriting your database layer because it's coupled to your business logic
  • Debugging AI-generated code with no tests and no traces
  • Explaining to your future self why this architecture made sense

Final Thoughts

The future of software development is AI-augmented - there's no arguing that anymore. Augmentation, however, only works when there's something solid to augment!

So build a solid foundation:

  • Architecture: Hexagonal, decoupled, testable
  • Documentation: CLAUDE.md files, docs folders, ADRs
  • Testing: Unit, integration, E2E, agent scenarios
  • Tooling: Custom CLIs, observability, token analysis
  • Workflow: Parallel agents with clear roles

Then let AI multiply your impact - not your technical debt. And most importantly - never lose touch with how things work under the hood. If you succumb to the full vibe coding experience while building anything of moderate complexity, you will 100% find yourself one day lost in a spaghetti-code nightmare.

Perhaps this will change one day, but I'm still 100% convinced that engineers will always be in the loop. Engineers will not disappear because of AI. Engineers will instead become 10-100x more effective than they are today.

I firmly believe, though, that this is all contingent on investing in a solid foundation, a sound workflow, and a systems-thinking mindset.


This is the second article in a series on building AI-ready codebases. The first, Deterministic Testing for Agentic AI Systems, covers testing AI agents with behavioral assertions instead of LLM evaluations.