
How to Create AI Agent Skills and Playbooks with Markdown: Stop Hardcoding Agent Behavior

Problem

My AI agents kept breaking. Every time I needed to change their behavior, I had to:

  1. Open Python files
  2. Find the right function
  3. Modify hardcoded instructions
  4. Redeploy the entire service

When I wanted my code reviewer agent to also check for security issues, I had to duplicate logic. When I wanted my research agent to use a new tool, I had to modify its core behavior file. When a non-technical teammate wanted to adjust how an agent responds, they couldn’t, because they don’t write Python.

Here’s what my agent code looked like:

agent.py (BEFORE)
class CodeReviewerAgent:
    def __init__(self, llm):
        self.llm = llm  # LLM client exposing call(instructions, input)
        self.instructions = """
        You are a code reviewer. Check for:
        - Code quality
        - Bugs
        - Style issues
        """

    def review(self, code):
        # Hardcoded behavior
        issues = self.llm.call(self.instructions, code)
        return issues


class ResearchAgent:
    def __init__(self, llm):
        self.llm = llm  # same client, wired up again
        self.instructions = """
        You are a research assistant. Search for information and summarize.
        """

    def research(self, query):
        # Duplicate logic with slight variations
        results = self.llm.call(self.instructions, query)
        return results

Every agent had its own copy of similar logic. Adding a new behavior meant writing more code. Sharing behaviors between agents was painful.

What happened?

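I discovered that modern LLMs follow instructions written in markdown well: headings, numbered steps, and bullet lists in the prompt give the model structure it can work through in order. Instead of hardcoding agent instructions in Python, I could keep them in markdown files that the code loads and passes to the model at runtime.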

The key insight: Markdown-based skills separate the “what” from the “how.” The instructions live in readable, editable markdown files. The code just loads and passes them to the AI.

This changed everything:

  • Non-technical people can edit agent behavior by editing markdown
  • Behaviors can be version controlled separately from code
  • Agents can modify their own instructions at runtime
  • Sharing behaviors becomes as simple as sharing a file

The Solution: Skill Files

A skill file is a markdown document that defines what an agent should do. Here’s the basic structure I use:

skill.md
# Skill Name
## Purpose
[What this skill does and when to use it]
## Instructions
[Step-by-step instructions for the AI to follow]
## Tools Available
- Tool 1: [description]
- Tool 2: [description]
## Constraints
- [What not to do]
- [Edge cases to handle]
## Examples
[Concrete examples of expected behavior]
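
Because the structure is just `## ` headings, a skill file can be checked before it is handed to the model. The following is a minimal sketch, not part of any framework: `parse_skill`, `missing_sections`, and `REQUIRED_SECTIONS` are hypothetical names, and which sections count as required is my assumption. Headings inside `~~~` or backtick fences are ignored so that an output template does not get mistaken for a section.

```python
import re

# Assumption: these are the sections every skill must have.
REQUIRED_SECTIONS = {"Purpose", "Instructions", "Constraints"}

# Lines starting with either marker open or close a fenced block.
FENCE_MARKERS = ("~~~", "`" * 3)


def parse_skill(text: str) -> dict[str, str]:
    """Split a skill file into {section heading: body} using '## ' headings."""
    sections: dict[str, str] = {}
    current = None
    in_fence = False
    for line in text.splitlines():
        if line.startswith(FENCE_MARKERS):
            in_fence = not in_fence
        # Only treat '## ' lines as headings when outside a fenced block.
        heading = None if in_fence else re.match(r"^## (.+)$", line)
        if heading:
            current = heading.group(1).strip()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def missing_sections(text: str) -> set[str]:
    """Return the required section names that the skill file lacks."""
    return REQUIRED_SECTIONS - set(parse_skill(text))
```

Running `missing_sections` over every skill file in a pre-commit hook or test would catch a skill that lost its constraints during an edit.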

Let me show you how I converted my code reviewer agent:

~/.claude/skills/code-reviewer/skill.md
# Code Review Skill
## Purpose
Perform systematic code reviews focusing on quality, security, and maintainability. Use this skill whenever code has been written or modified.
## Instructions
1. Read the provided code file completely
2. Check for CRITICAL issues first:
   - Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
   - Logic errors that could cause runtime failures
3. Check for HIGH priority issues:
   - Missing error handling
   - Performance bottlenecks
   - Memory leaks
4. Check for MEDIUM priority issues:
   - Code duplication
   - Poor naming conventions
   - Missing documentation
5. Provide actionable recommendations with code examples
## Output Format
~~~
## Summary
[2-3 sentence overview]
## Issues Found
| Priority | Issue | Location | Recommendation |
|----------|-------|----------|----------------|
| CRITICAL | ... | ... | ... |
## Suggested Changes
[Code blocks with improvements]
~~~
## Constraints
- Never approve code with CRITICAL security issues
- Always provide specific line numbers for issues
- Include positive feedback on well-written code

Now my agent code just loads this file:

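agent.py (AFTER)
from pathlib import Path


class SkillBasedAgent:
    def __init__(self, skill_path: str, llm):
        # expanduser() resolves "~"; open() does not do this on its own
        self.skill = Path(skill_path).expanduser().read_text(encoding="utf-8")
        self.llm = llm  # any client exposing call(instructions, context)

    def run(self, context: str) -> str:
        return self.llm.call(self.skill, context)


# Usage (llm and code_to_review are supplied by the caller)
reviewer = SkillBasedAgent("~/.claude/skills/code-reviewer/skill.md", llm)
result = reviewer.run(code_to_review)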

The behavior is now separate from the code. I can edit the markdown file without touching Python.

Playbooks for Complex Workflows

A skill handles a single task. A playbook handles multi-step workflows with phases and dependencies.

~/.claude/skills/planner/skill.md
# Playbook: Technical Research
## Trigger
User asks for research on a technical topic or when implementing unfamiliar features.
## Phases
### Phase 1: Information Gathering
- Input: Research topic from user
- Actions:
  1. Search official documentation using context7 MCP
  2. Search recent articles using web-search MCP
  3. Search GitHub for relevant code examples
- Output: Collected sources and initial findings
### Phase 2: Analysis
- Dependencies: Phase 1 output
- Actions:
  1. Compare approaches from different sources
  2. Identify best practices and anti-patterns
  3. Evaluate relevance to current project context
- Output: Structured analysis with recommendations
### Phase 3: Synthesis
- Dependencies: Phase 2 analysis
- Actions:
  1. Create implementation recommendations
  2. Identify potential risks and mitigations
  3. Suggest next steps for user
- Output: Actionable research summary
## Success Criteria
- At least 3 authoritative sources consulted
- Clear comparison of different approaches
- Actionable recommendations provided

The playbook pattern gives the AI a structured workflow. Each phase has clear inputs, outputs, and dependencies. This prevents the AI from jumping around or missing steps.
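
If you drive the phases from code instead of handing the whole playbook to the model in one prompt, the dependency rule can be enforced explicitly. This is a rough sketch under my own assumptions: `Phase` and `run_playbook` are hypothetical names, and `llm` here is any callable that takes a single prompt string and returns text, which is simpler than the `llm.call(instructions, context)` interface used earlier.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Phase:
    name: str
    actions: str
    depends_on: list[str] = field(default_factory=list)


def run_playbook(
    phases: list[Phase], topic: str, llm: Callable[[str], str]
) -> dict[str, str]:
    """Run phases in order, feeding each one the outputs it depends on."""
    outputs: dict[str, str] = {}
    for phase in phases:
        # Refuse to run a phase whose inputs have not been produced yet.
        missing = [dep for dep in phase.depends_on if dep not in outputs]
        if missing:
            raise ValueError(f"{phase.name} is missing outputs from: {missing}")
        # A phase without dependencies starts from the user's topic.
        context = "\n\n".join(outputs[dep] for dep in phase.depends_on) or topic
        prompt = f"## {phase.name}\n{phase.actions}\n\n## Input\n{context}"
        outputs[phase.name] = llm(prompt)
    return outputs
```

Each phase only sees what it declared as a dependency, which keeps prompts short and makes a skipped phase fail loudly instead of silently producing a thin answer.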

Directory Structure

Here’s how I organize my skills:

Directory Structure
~/.claude/skills/
├── code-reviewer/
│   └── skill.md          # Code review skill
├── security-reviewer/
│   └── skill.md          # Security-focused review
├── planner/
│   └── skill.md          # Research and planning
├── tdd-guide/
│   └── skill.md          # Test-driven development
└── refactor-cleaner/
    └── skill.md          # Code cleanup

Each skill is self-contained in its own directory. This makes it easy to share, version, and modify individual skills.
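
With one directory per skill, loading all of them takes a few lines. A minimal sketch, assuming the layout above; `discover_skills` is a hypothetical helper name, not something Claude Code provides.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, str]:
    """Map each skill directory name to the contents of its skill.md."""
    skills: dict[str, str] = {}
    # Directories without a skill.md are simply skipped by the glob.
    for skill_file in sorted(root.glob("*/skill.md")):
        skills[skill_file.parent.name] = skill_file.read_text(encoding="utf-8")
    return skills
```

For the tree above, you would call it as `discover_skills(Path("~/.claude/skills").expanduser())` and get back a dict keyed by `code-reviewer`, `planner`, and so on.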

Self-Modification: Agents That Improve Themselves

The most powerful aspect of markdown-based skills is that agents can edit them at runtime. I added this pattern to my skills:

Self-Modification Pattern
## Self-Modification Capability
### When to Modify
- After repeated failures on similar tasks
- When user feedback indicates improvement needed
- When environment constraints change
### How to Modify
1. Read current configuration file
2. Identify specific section to update
3. Propose changes to user for approval
4. Apply approved changes to markdown file
5. Log modification in changelog section

I tested this with my code reviewer. After it failed to catch a specific pattern three times, it suggested:

I've noticed I keep missing React useEffect cleanup functions. Should I add this to my CRITICAL checks?

Proposed addition to skill.md:
## Instructions (addition)
2. Check for CRITICAL issues first:
   - Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
   - Logic errors that could cause runtime failures
   - Missing useEffect cleanup in React components <-- NEW

I approved, and it modified its own skill file. The next time, it caught the issue.
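
The approve-then-apply steps can also be enforced in code, so the file never changes without a human saying yes. The sketch below is only an illustration: `apply_approved_change` is a hypothetical helper, and it assumes the changelog is the last section of the skill file so that new entries can simply be appended.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def apply_approved_change(
    skill_path: Path,
    after_line: str,
    new_line: str,
    approved: bool,
    today: Optional[date] = None,
) -> bool:
    """Insert new_line after after_line and log it, only if the user approved."""
    if not approved:
        return False  # Leave the file untouched without approval.
    lines = skill_path.read_text(encoding="utf-8").splitlines()
    if after_line not in lines:
        raise ValueError(f"anchor line not found: {after_line!r}")
    lines.insert(lines.index(after_line) + 1, new_line)
    # Assumption: the changelog lives at the end of the file.
    if "## Changelog" not in lines:
        lines.append("## Changelog")
    stamp = (today or date.today()).isoformat()
    entry = new_line.strip().lstrip("- ")
    lines.append(f"- {stamp}: {entry}")
    skill_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```

Keeping the write behind an explicit `approved` flag means the agent can propose as often as it likes while the skill file only changes on a reviewed decision.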

Why This Matters

Separating instructions from code changed how I work with AI agents:

Separation of concerns: Instructions live in markdown, logic in code. Each can evolve independently.

Non-technical contributions: My product manager can adjust how the research agent works by editing a markdown file. No Python knowledge required.

Version control: I track changes to agent behavior separately from codebase changes. I can see exactly when a behavior changed and why.

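Runtime flexibility: I can modify agent behavior without redeploying, as long as the agent re-reads the skill file for each run. The SkillBasedAgent shown earlier reads the file once in its constructor, so a long-lived instance has to be recreated to pick up an edit.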

Agent self-improvement: Agents can suggest and apply their own improvements based on experience.

Cross-platform portability: A skill file is plain text, so the same file can be passed as instructions to Claude Code, a LangChain agent, or any other framework that accepts a text prompt.
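
The runtime flexibility point depends on when the file is read. Here is a small variation that reads the skill on every call, so an edit takes effect on the next run without restarting anything. `ReloadingSkillAgent` is a hypothetical name, and `llm` is assumed to be any callable taking the instructions and the context.

```python
from pathlib import Path
from typing import Callable


class ReloadingSkillAgent:
    def __init__(self, skill_path: Path, llm: Callable[[str, str], str]):
        self.skill_path = skill_path
        self.llm = llm

    def run(self, context: str) -> str:
        # Read the skill at call time so edits apply to the next run.
        skill = self.skill_path.read_text(encoding="utf-8")
        return self.llm(skill, context)
```

The trade-off is one extra file read per call, which is negligible next to the latency of a model request.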

Common Mistakes

I made several mistakes when starting with markdown skills:

Mistake 1: Over-engineering the structure

Over-engineered (BAD)
# Complex Skill System
## Metadata
- Version: 1.0.3
- Author: System
- Created: 2024-01-15
- Dependencies: [list of 20 items]
- Configuration: [nested config]
## Abstract
[Long theoretical description]
## Theoretical Framework
[Academic explanations]

Keep it simple. The AI reads instructions, not documentation about instructions.

Simple (GOOD)
# Code Review Skill
## Purpose
Review code for quality, security, and maintainability.
## Instructions
1. Check for CRITICAL issues (security, logic errors)
2. Check for HIGH issues (error handling, performance)
3. Check for MEDIUM issues (style, documentation)

Mistake 2: Missing constraints

Without constraints, the AI will do things you don’t expect.

Missing Constraints (BAD)
## Instructions
Review the code and provide feedback.

With Constraints (GOOD)
## Constraints
- Never approve code with hardcoded credentials
- Maximum review time: 5 minutes
- Focus only on the specified file, not dependencies
- If file >500 lines, suggest splitting first

Mistake 3: No examples

Examples show the AI exactly what you want.

Without Examples (BAD)
## Output Format
Provide a summary and list of issues.

With Examples (GOOD)
## Output Format
~~~
## Summary
Found 2 critical issues in authentication flow.
## Issues Found
| Priority | Issue | Location | Recommendation |
|----------|-------|----------|----------------|
| CRITICAL | SQL injection | auth.py:45 | Use parameterized queries |
| HIGH | Missing error handling | auth.py:67 | Add try/except block |
~~~

Mistake 4: Hardcoded values in markdown

Don’t put specific values in the skill file.

Hardcoded (BAD)
## Instructions
Review files in /home/user/projects/myapp/src/

Parameterized (GOOD)
## Instructions
Review files in {project_path}
The user will provide {project_path} at runtime.
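
If your own code fills in the placeholders before calling the model, a small substitution step is enough. This is a sketch with a hypothetical helper, `render_skill`. I use a regex instead of `str.format` on the assumption that skill files may contain literal braces (JSON snippets, code examples), which `str.format` would trip over.

```python
import re


def render_skill(template: str, params: dict[str, str]) -> str:
    """Replace {name} placeholders with values, failing on unknown names."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value provided for {{{name}}}")
        return params[name]

    # Only lowercase identifiers in braces count as placeholders,
    # so braces in JSON or code examples are left alone.
    return re.sub(r"\{([a-z_][a-z0-9_]*)\}", replace, template)
```

Raising on a missing value is deliberate: a prompt that still contains a raw `{project_path}` tends to produce confusing model output that is hard to trace back to its cause.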

Integration with Claude Code

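Claude Code works well with this layout. It reads the project’s CLAUDE.md at the start of a session, so I reference my skills there. (Claude Code also has a built-in skills feature with its own file naming and frontmatter conventions; check the current documentation if you want automatic discovery instead of the explicit instruction I use here.)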

CLAUDE.md
## Available Skills
- Use `code-reviewer` when: After writing new code
- Use `security-reviewer` when: Before committing sensitive changes
- Use `planner` when: Starting complex features
- Use `tdd-guide` when: Writing tests first
## Skill Invocation
When a trigger condition matches, automatically load and follow the corresponding skill file from ~/.claude/skills/{skill-name}/skill.md

Now when I write code and Claude detects it should review, it loads the code-reviewer skill and follows those instructions.
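
Claude Code does this matching itself, but the same convention is easy to reuse in your own agent runner. The sketch below assumes the exact "Use `name` when: condition" line format from my CLAUDE.md; `parse_triggers` and `skill_file` are hypothetical helper names.

```python
import re
from pathlib import Path

# Matches lines like: - Use `code-reviewer` when: After writing new code
TRIGGER_LINE = re.compile(r"^- Use `([^`]+)` when: (.+)$")


def parse_triggers(claude_md: str) -> dict[str, str]:
    """Map each skill name to its trigger condition."""
    triggers: dict[str, str] = {}
    for line in claude_md.splitlines():
        match = TRIGGER_LINE.match(line.strip())
        if match:
            triggers[match.group(1)] = match.group(2).strip()
    return triggers


def skill_file(name: str, root: Path) -> Path:
    """Resolve a skill name to {root}/{name}/skill.md."""
    return root / name / "skill.md"
```

Deciding whether a trigger condition actually applies is still a judgment call for the model; this code only turns the list into data a runner can present or log.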

Summary

In this post, I showed how to create AI agent skills and playbooks using markdown instead of hardcoding behavior in code. The key point is that modern AI engines can natively interpret markdown instructions, which separates behavior definition from implementation code.

I converted my hardcoded Python agents to skill-based agents with markdown files. This enabled non-technical editing, version control of behaviors, and even self-modification by the agents themselves. The skill structure includes purpose, instructions, tools, constraints, and examples. For complex workflows, playbooks add phases with inputs, outputs, and dependencies.

Start with a simple skill file for your most common agent task. Once you see the pattern work, expand into multi-phase playbooks. Your agents will become more maintainable, more flexible, and capable of improving themselves.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can reach me by email.

