Stop Letting AI Agents Wander Blind: OpenCode Planning Toolkit Exposed

Your AI agent just spent 3 hours refactoring code that didn't need refactoring. Meanwhile, the critical authentication bug sits untouched. Sound familiar?

Here's the brutal truth most developers won't admit: unstructured AI agents are expensive chaos machines. They hallucinate priorities, forget project standards mid-session, and leave behind a trail of half-finished "improvements" that nobody asked for. Every new chat session starts from zero context. Every agent operates in its own vacuum. The result? Technical debt that compounds faster than your coffee bill.

But what if your AI agents could remember? What if they could share a collective brain across sessions, follow standardized plans, and actually finish what they start?

Enter the OpenCode Planning Toolkit — a deceptively simple plugin that transforms scattered AI interactions into a disciplined, project-wide orchestration system. Created by Igor Warzocha, this isn't just another todo list hack. It's a fundamental rearchitecture of how AI agents participate in software development.

In this deep dive, I'll expose exactly how this toolkit works, why top developers are quietly adopting it, and how you can deploy it in under 5 minutes. The chaos ends now.

What Is OpenCode Planning Toolkit?

The OpenCode Planning Toolkit is a plugin for OpenCode, an open-source AI coding agent for the terminal. It introduces structured planning primitives directly into your development workflow. At its core, it solves one deceptively simple problem: AI agents have no persistent memory of what they're supposed to be doing.

Igor Warzocha built this toolkit after observing a pattern that plagues every AI-assisted workflow: agents start strong, then gradually drift. Without anchors — without specs, plans, and progress tracking — even the most capable models eventually lose the plot. The toolkit's genius lies in its markdown-native architecture. Rather than inventing a new format or locking you into proprietary systems, it uses simple .md files stored in your repository. This means:

Full version control — specs and plans live in git history
Universal readability — any developer can open and understand them
Zero lock-in — your planning data survives any tool migration

The toolkit is trending now because the AI development landscape has reached an inflection point. Early adopters experimented with raw prompts; now, teams scaling beyond solo projects desperately need governance mechanisms. The Planning Toolkit arrives exactly when the market screams for structure.

What makes it particularly powerful is its bundled skill system — a meta-instruction set that automatically trains agents on proper planning workflow. You don't manually prompt "please check existing plans first." The skill embeds that behavior. This is the difference between telling a junior developer to "be careful" and establishing code review processes that enforce quality.

Key Features That Separate Amateurs From Pros

Let's dissect what makes this toolkit genuinely transformative versus the endless sea of productivity gimmicks.

Persistent Cross-Session Memory

The toolkit maintains a repo-wide todo list that survives chat restarts, browser refreshes, and even machine reboots. Your docs/plans/ and docs/specs/ directories become the single source of truth. When you reconnect, agents immediately receive <available_plans> in their system prompt — no context re-establishment required.
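To make the injection idea concrete, here is a minimal sketch of how a listing like this could be assembled from plan metadata. The exact layout the plugin emits inside <available_plans> is not shown in this article, so the line format below and the function name available_plans_block are assumptions for illustration only.

```python
from typing import Dict, List


def available_plans_block(plans: List[Dict[str, str]]) -> str:
    """Render plan metadata as a tagged listing suitable for a system prompt.

    Assumption: one line per plan with name, status, and description.
    The real plugin's format may differ.
    """
    lines = ["<available_plans>"]
    for meta in sorted(plans, key=lambda m: m.get("plan name", "")):
        name = meta.get("plan name", "(unnamed)")
        status = meta.get("plan status", "unknown")
        description = meta.get("plan description", "")
        lines.append(f"- {name} [{status}]: {description}")
    lines.append("</available_plans>")
    return "\n".join(lines)


print(available_plans_block([
    {
        "plan name": "user-auth",
        "plan description": "JWT authentication for API",
        "plan status": "active",
    },
]))
```

Every plan adds a line to the prompt, so the listing grows with the plan directory.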

Dual-Primitive Architecture: Specs + Plans

Specs are reusable standards (typescript-standards.md, api-design-principles.md). Plans are actionable work items with implementation steps. This separation mirrors how senior engineers actually think: abstract principles guide concrete execution. The toolkit enforces this mental model programmatically.

Automatic Spec Injection

When agents read plans via read_plan, linked specs expand inline. This means an agent implementing authentication doesn't just see "follow TypeScript standards" — it sees the actual standards in context. No tab-switching. No "oh right, I forgot about that rule."

Bundled Skill: Agent Behavior Modification

The plans-and-specs skill is where the magic happens. It provides:

Workflow sequencing: Create plan → append REPO specs → inquire about FEATURE specs
Deduplication guards: Check existing plans before creating new ones
Completion discipline: Proper mark-done protocols

This isn't documentation — it's behavioral firmware for your agents.

Status-Driven Progress Tracking

Binary active/done status eliminates ambiguity. No "in progress" limbo. No "almost done" fiction. Plans are either live or complete, forcing decisive action.

Use Cases: Where This Toolkit Absolutely Dominates

1. Multi-Agent Team Coordination

Running multiple AI agents on the same codebase? Without coordination, they step on each other's work constantly. The Planning Toolkit acts as a shared bulletin board. Agent A creates the auth plan; Agent B sees it's active and picks up the payment integration instead. No human project manager required.

2. Long-Running Feature Development

That refactoring project spanning 47 chat sessions? Previously, you'd reconstruct context every time. Now, the plan persists. Each session picks up exactly where the last left off. The read_plan tool ensures new agents inherit full context instantly.

3. Standards Enforcement at Scale

Growing teams (or agent fleets) struggle with consistency. A repo-scoped spec for error handling propagates automatically to every new plan. When the standard evolves, update one file — every linked plan reflects the change on next read.

4. Audit-Ready Development Trails

For regulated industries, the markdown trail provides automatic documentation. Because plans and specs are plain files in the repository, every plan creation, spec linkage, and completion mark lands in git history with an author and timestamp once it is committed. Compliance auditors love this. Future-you loves this when debugging "why did we do it this way?"

5. Onboarding Acceleration

New team member or fresh agent instance? Point them at docs/plans/. They absorb project state faster than any handoff meeting. The expanded specs in read_plan output function as self-documenting requirements.

Step-by-Step Installation & Setup Guide

Ready to stop the chaos? Here's your exact deployment path.

Prerequisites

OpenCode installed and configured
A project repository initialized with git
Node.js environment for plugin loading

Installation

Add the plugin to your opencode.json configuration:

{
  "plugin": [
    "@howaboua/opencode-planning-toolkit@latest"
  ]
}

For local development or customization, reference the file directly:

{
  "plugin": [
    "file:///path/to/opencode-planning-toolkit/index.ts"
  ]
}

Critical: After saving opencode.json, restart OpenCode completely. Plugin loading requires a fresh initialization cycle.

Directory Structure Initialization

The toolkit auto-creates directories on first use, but proactive setup prevents errors:

mkdir -p docs/specs docs/plans

Add to .gitignore if you want plans excluded (not recommended for team use):

# Only if keeping plans local — generally DON'T do this
docs/plans/*.md

Verification

Trigger the bundled skill by asking your agent:

"Create a spec for our project's coding standards"

If the agent responds with structured markdown in docs/specs/, you're operational. The <available_plans> injection in system prompts happens automatically — no configuration needed.

Environment-Specific Notes

Monorepos: Consider packages/*/docs/ subdirectories; the toolkit respects relative paths
CI/CD integration: Parse docs/plans/*.md in build scripts to block deployment with active plans
Backup strategy: Since everything's markdown, standard git workflows suffice
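The CI/CD note above can be sketched in a few lines of Python. This is an illustration, not part of the toolkit: the helper names is_active and active_plans are hypothetical, and the sketch assumes every plan keeps its status in a frontmatter line of the form plan status: active.

```python
from pathlib import Path
from typing import List


def is_active(plan_text: str) -> bool:
    """True if the frontmatter block contains `plan status: active`."""
    lines = plan_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter, stop looking
            break
        key, _, value = line.partition(":")
        if key.strip() == "plan status" and value.strip() == "active":
            return True
    return False


def active_plans(plans_dir: Path) -> List[str]:
    """File names of all active plans directly inside plans_dir."""
    return sorted(
        path.name
        for path in plans_dir.glob("*.md")
        if is_active(path.read_text(encoding="utf-8"))
    )
```

A build script could call active_plans(Path("docs/plans")) and exit with a non-zero status when the list is non-empty. Only the frontmatter is inspected, so a plan that merely mentions the phrase in its body does not trip the gate.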

REAL Code Examples From the Repository

Let's examine actual toolkit usage with the exact patterns from Igor Warzocha's implementation.

Example 1: Creating a Reusable Specification

The foundation of disciplined development is standards that outlive any single session. Here's how the toolkit structures specs:

# Spec: typescript-standards

Scope: repo

- Use strict mode
- All functions must have explicit return types
- Prefer `const` over `let`
- No `any` types without justification

What's happening here:

The Scope: repo declaration is load-bearing. It signals to the bundled skill that this spec should auto-attach to every new plan. The markdown format ensures any developer — human or AI — parses this without special tooling. Notice the absence of complex metadata; simplicity is the durability strategy.

Practical implementation pattern: Create specs during project initialization, before any plans exist. This establishes the rulebook before gameplay begins. Once a spec is linked to a plan, the agent receives its contents whenever it calls read_plan and applies it without explicit prompting.
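Here is a minimal sketch of how the Scope line could be read to find the specs that should attach to every plan. It is not the plugin's code. The names spec_scope and repo_scoped_specs are hypothetical, and the input is assumed to be a mapping from spec name to file contents.

```python
import re
from typing import Dict, List, Optional

# Matches a line such as "Scope: repo" anywhere in the spec.
SCOPE_RE = re.compile(r"^Scope:[ \t]*(\w+)[ \t]*$", re.MULTILINE)


def spec_scope(spec_text: str) -> Optional[str]:
    """Return the lowercased value of the first Scope line, or None."""
    match = SCOPE_RE.search(spec_text)
    return match.group(1).lower() if match else None


def repo_scoped_specs(specs: Dict[str, str]) -> List[str]:
    """Names of specs whose Scope line says `repo`, sorted for stable output."""
    return sorted(name for name, text in specs.items() if spec_scope(text) == "repo")


example = "# Spec: typescript-standards\n\nScope: repo\n\n- Use strict mode\n"
print(spec_scope(example))
```

A spec without a Scope line yields None here, so it is never auto-attached; whether the real skill treats a missing scope the same way is not stated in the source.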

Example 2: Creating a Structured Plan

Plans transform vague intentions into executable sequences. The toolkit enforces minimum structure through its bundled skill:

---
plan name: user-auth
plan description: JWT authentication for API
plan status: active
---

## Idea
Add secure JWT authentication to the API with login, logout, and token refresh.

## Implementation
- Design JWT token structure and expiry policy
- Add /auth/login endpoint with password validation
- Add /auth/refresh endpoint for token renewal
- Add /auth/logout endpoint to invalidate tokens
- Write tests for all auth endpoints

## Required Specs
<!-- SPECS_START -->
<!-- SPECS_END -->

Critical implementation details:

The YAML frontmatter (delimited by ---) carries machine-parseable metadata while remaining human-readable. The plan status: active field records the plan's workflow state, and agents check it before claiming work. The <!-- SPECS_START --> and <!-- SPECS_END --> comments are injection points; when append_spec runs, the linked spec name appears between these markers.

The minimum 5 implementation steps requirement (enforced by bundled skill) prevents under-specified plans. This catches the classic anti-pattern: "Implement authentication" as a single bullet point.

Example 3: Linking Specs to Plans

This is where the toolkit's intelligence shines — automatic standard propagation:

## Required Specs
<!-- SPECS_START -->
- typescript-standards
<!-- SPECS_END -->

The operational sequence:

Agent creates plan → bundled skill auto-links all repo-scoped specs
Agent queries: "Should I create or link any feature-specific specs?"
Human or agent responds with spec names
append_spec tool injects references between the marker comments
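The last step of that sequence can be sketched as a small text transformation. This is an assumption about the behavior, not the append_spec tool's actual implementation, and link_spec is a hypothetical name: it inserts a bullet just before the closing marker and does nothing if the spec is already linked.

```python
START = "<!-- SPECS_START -->"
END = "<!-- SPECS_END -->"


def link_spec(plan_text: str, spec_name: str) -> str:
    """Insert `- spec_name` between the spec markers, skipping duplicates."""
    lines = plan_text.splitlines()
    try:
        start = lines.index(START)
        end = lines.index(END, start + 1)
    except ValueError:
        raise ValueError("plan is missing the SPECS_START/SPECS_END markers") from None
    entry = f"- {spec_name}"
    if entry in lines[start + 1:end]:
        return plan_text  # already linked, leave the plan untouched
    lines.insert(end, entry)
    trailing_newline = "\n" if plan_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


plan = "## Required Specs\n<!-- SPECS_START -->\n<!-- SPECS_END -->\n"
print(link_spec(plan, "typescript-standards"))
```

Keeping the operation idempotent means an agent that repeats the call cannot pile up duplicate references.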

When read_plan executes, the toolkit expands typescript-standards inline. The agent doesn't just know the spec exists — it receives the actual content in context. This eliminates the "I forgot to check the standards doc" failure mode.

Example 4: Reading Plans with Full Context Expansion

The read_plan tool's output is where the magic becomes visible. The agent receives:

# Plan: user-auth (status: active)

## Idea
Add secure JWT authentication to the API with login, logout, and token refresh.

## Implementation
[...steps...]

## Required Specs (expanded)

### typescript-standards
- Use strict mode
- All functions must have explicit return types
- Prefer `const` over `let`
- No `any` types without justification

Why this matters: The agent doesn't need a separate read_file call for each spec; the plan tool composes everything into one response. That saves tool-call round trips and eliminates the "which files do I need?" decision fatigue that derails agents.
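A rough sketch of that composition step, assuming the spec contents are already loaded into a dictionary keyed by spec name. The functions linked_specs and expand_specs are hypothetical, and the real read_plan tool may format its output differently.

```python
import re
from typing import Dict, List

# Captures everything between the two marker comments.
MARKED = re.compile(r"<!-- SPECS_START -->\n(.*?)<!-- SPECS_END -->", re.DOTALL)


def linked_specs(plan_text: str) -> List[str]:
    """Spec names listed as bullets between the markers."""
    match = MARKED.search(plan_text)
    if not match:
        return []
    return [
        line[2:].strip()
        for line in match.group(1).splitlines()
        if line.startswith("- ")
    ]


def expand_specs(plan_text: str, specs: Dict[str, str]) -> str:
    """Build an expanded section with each linked spec's content inline."""
    parts = ["## Required Specs (expanded)"]
    for name in linked_specs(plan_text):
        content = specs.get(name)
        parts.append(f"### {name}")
        parts.append(content.strip() if content is not None else "(spec file not found)")
    return "\n\n".join(parts)
```

A missing spec is rendered as a visible placeholder rather than silently dropped, so a broken link shows up in the agent's context instead of disappearing.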

Advanced Usage & Best Practices

Spec Granularity Strategy

Repo specs for universal rules (formatting, testing requirements). Feature specs for domain-specific constraints (auth flows, payment compliance). Resist the urge to over-specify; aim for 3-7 repo specs maximum. Too many standards paralyze rather than guide.

Plan Lifecycle Management

Implement a weekly plan review ritual. Archive done plans to docs/plans/archive/ or delete if truly complete. Active plan accumulation creates cognitive overhead that defeats the toolkit's purpose.
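The archive step of that ritual could be scripted. The sketch below is one way to do it, not a toolkit feature: plan_status and archive_done_plans are hypothetical helpers, and they assume the status lives in a frontmatter line of the form plan status: done.

```python
from pathlib import Path
from typing import List, Optional


def plan_status(plan_text: str) -> Optional[str]:
    """Value of the `plan status` frontmatter field, or None if absent."""
    lines = plan_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # reached the end of the frontmatter
            return None
        key, _, value = line.partition(":")
        if key.strip() == "plan status":
            return value.strip()
    return None


def archive_done_plans(plans_dir: Path) -> List[str]:
    """Move done plans into plans_dir/archive and return the moved file names."""
    archive = plans_dir / "archive"
    moved: List[str] = []
    # sorted() materializes the listing before any file is moved
    for path in sorted(plans_dir.glob("*.md")):
        if plan_status(path.read_text(encoding="utf-8")) == "done":
            archive.mkdir(exist_ok=True)
            path.rename(archive / path.name)
            moved.append(path.name)
    return moved
```

Calling archive_done_plans(Path("docs/plans")) leaves active plans in place. Because the glob is not recursive, already archived files are never revisited.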

CI/CD Integration Hook

Add a pre-commit check:

#!/bin/bash
# Warn about plans still marked active; this never blocks the commit
for plan in docs/plans/*.md; do
  # Skip the unexpanded glob pattern when no plan files exist
  [ -e "$plan" ] || continue
  if grep -q "^plan status: active" "$plan"; then
    echo "WARNING: Active plan found: $plan"
  fi
done

Because the script only prints warnings and always exits successfully, it surfaces planning state at commit time without being draconian.

Multi-Project Spec Inheritance

For organizations with multiple repositories, maintain a common-standards repo. Clone specs via submodule or copy script. The markdown format makes this trivial compared to proprietary planning tools.

Comparison With Alternatives

Feature	OpenCode Planning Toolkit	GitHub Projects	Linear	Raw Prompt Engineering
AI-native design	✅ Core architecture	❌ Manual only	❌ Manual only	⚠️ Ad-hoc
Cross-session persistence	✅ Automatic	✅ Via issues	✅ Via issues	❌ None
Version control integration	✅ Native markdown	⚠️ API-dependent	❌ Proprietary	❌ None
Agent behavior modification	✅ Bundled skills	❌ None	❌ None	⚠️ Prompt-dependent
Inline spec expansion	✅ Automatic	❌ N/A	❌ N/A	❌ Manual
Zero vendor lock-in	✅ Pure markdown	⚠️ GitHub-dependent	❌ SaaS lock-in	✅ Text files
Setup complexity	2 minutes	10+ minutes	15+ minutes	Hours of tuning

The decisive advantage: GitHub Projects and Linear excel at human project management. The Planning Toolkit is architected for agent coordination. The bundled skill system, automatic prompt injection, and markdown-native design create a category that doesn't meaningfully exist elsewhere.

Raw prompt engineering? You'll spend more time managing prompts than building product. The toolkit abstracts that complexity into reusable infrastructure.

FAQ: What Developers Actually Ask

Q: Does this work with Claude, GPT-4, or other models? A: The toolkit operates at the OpenCode plugin level, and OpenCode supports multiple model backends. The bundled skill's instructions are model-agnostic markdown that any capable LLM interprets.

Q: What happens if two agents try to modify the same plan simultaneously? A: Standard file-system concurrency applies. The toolkit doesn't implement distributed locking — for high-contention scenarios, wrap plan modifications in your VCS workflow (branch → modify → merge).

Q: Can I use this without OpenCode? A: The core primitives (markdown specs/plans) are universal. However, the bundled skill and automatic prompt injection require OpenCode's plugin architecture. Adaptation to other tools would need custom implementation.

Q: How does this scale to 50+ active plans? A: The <available_plans> injection includes all plan names and descriptions, which consumes context window. For massive projects, implement a hierarchical structure: epic plans → sub-plans, or archive completed work aggressively.

Q: Is there a migration path from Jira/Linear/Asana? A: Manual export to markdown. The simplicity is intentional — no complex import scripts because the format is deliberately minimal. Most teams find the migration cathartic (forced simplification).

Q: Can agents create plans without human approval? A: Yes, if your OpenCode configuration allows autonomous tool use. The bundled skill guides how to plan, but your permission settings govern whether. Configure according to your risk tolerance.

Q: What about plan dependencies and blocking relationships? A: Not natively implemented. Use naming conventions (01-auth.md, 02-billing.md) or explicit "Blocked by: X" lines in plan Idea sections. The toolkit prioritizes simplicity over exhaustive features.
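If you adopt the "Blocked by: X" convention from that last answer, a small script can tell you which plans are actually workable. This is a sketch of a convention layered on top of the toolkit, not something the toolkit provides; blockers and ready_plans are hypothetical names, and comma-separated blocker lists are an assumption.

```python
import re
from typing import Dict, List

# Matches lines such as "Blocked by: auth" or "Blocked by: auth, billing".
BLOCKED_BY = re.compile(r"^Blocked by:[ \t]*(.+)$", re.MULTILINE)


def blockers(plan_text: str) -> List[str]:
    """Plan names mentioned on `Blocked by:` lines."""
    names: List[str] = []
    for match in BLOCKED_BY.finditer(plan_text):
        names.extend(part.strip() for part in match.group(1).split(",") if part.strip())
    return names


def ready_plans(plans: Dict[str, str], status: Dict[str, str]) -> List[str]:
    """Active plans whose blockers are all marked done."""
    return sorted(
        name
        for name, text in plans.items()
        if status.get(name) == "active"
        and all(status.get(blocker) == "done" for blocker in blockers(text))
    )
```

A blocker that does not exist in the status mapping counts as unfinished, so a typo in a plan name keeps the dependent plan blocked rather than quietly releasing it.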

Conclusion: The Structured Future Is Already Here

The OpenCode Planning Toolkit isn't flashy. It won't generate viral demos of agents writing code in split-second montages. What it does is far more valuable: it makes AI agents reliable teammates instead of unpredictable freelancers.

Igor Warzocha identified a genuine infrastructure gap and filled it with elegant simplicity. The markdown-native approach, bundled skill system, and cross-session persistence solve problems that every scaling AI-assisted team encounters. No lock-in. No complexity theater. Just working software.

My assessment? This is foundational tooling that will seem obvious in retrospect. The teams adopting it now are building competitive advantages in development velocity and code quality that compound over months.

Stop letting your AI agents wander blind. Install the OpenCode Planning Toolkit today. Create your first spec. Watch your agents finally finish what they start. The chaos ends when you decide it does — and that decision takes about 120 seconds.

Your future self, reviewing clean git log history and completed plan archives, will thank you.