How to Create AI Agent Skills and Playbooks with Markdown: Stop Hardcoding Agent Behavior
Problem
My AI agents kept breaking. Every time I needed to change their behavior, I had to:
- Open Python files
- Find the right function
- Modify hardcoded instructions
- Redeploy the entire service
When I wanted my code reviewer agent to also check for security issues, I had to duplicate logic. When I wanted my research agent to use a new tool, I had to modify its core behavior file. When a non-technical teammate wanted to adjust how an agent responds, they couldn’t—they don’t write Python.
Here’s what my agent code looked like:
    class CodeReviewerAgent:
        def __init__(self, llm):
            self.llm = llm  # any client object that exposes call(instructions, text)
            self.instructions = """
            You are a code reviewer. Check for:
            - Code quality
            - Bugs
            - Style issues
            """

        def review(self, code):
            # Hardcoded behavior
            issues = self.llm.call(self.instructions, code)
            return issues


    class ResearchAgent:
        def __init__(self, llm):
            self.llm = llm
            self.instructions = """
            You are a research assistant.
            Search for information and summarize.
            """

        def research(self, query):
            # Duplicate logic with slight variations
            results = self.llm.call(self.instructions, query)
            return results

Every agent had its own copy of similar logic. Adding a new behavior meant writing more code. Sharing behaviors between agents was painful.
What happened?
I discovered that modern AI systems can read markdown files and follow them natively. Instead of hardcoding agent instructions in Python, I could write them as markdown files that the AI interprets at runtime.
The key insight: Markdown-based skills separate the “what” from the “how.” The instructions live in readable, editable markdown files. The code just loads and passes them to the AI.
This changed everything:
- Non-technical people can edit agent behavior by editing markdown
- Behaviors can be version controlled separately from code
- Agents can modify their own instructions at runtime
- Sharing behaviors becomes as simple as sharing a file
The Solution: Skill Files
A skill file is a markdown document that defines what an agent should do. Here’s the basic structure I use:
    # Skill Name

    ## Purpose
    [What this skill does and when to use it]

    ## Instructions
    [Step-by-step instructions for the AI to follow]

    ## Tools Available
    - Tool 1: [description]
    - Tool 2: [description]

    ## Constraints
    - [What not to do]
    - [Edge cases to handle]

    ## Examples
    [Concrete examples of expected behavior]

Let me show you how I converted my code reviewer agent:
    # Code Review Skill

    ## Purpose
    Perform systematic code reviews focusing on quality, security, and maintainability. Use this skill whenever code has been written or modified.

    ## Instructions
    1. Read the provided code file completely
    2. Check for CRITICAL issues first:
       - Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
       - Logic errors that could cause runtime failures
    3. Check for HIGH priority issues:
       - Missing error handling
       - Performance bottlenecks
       - Memory leaks
    4. Check for MEDIUM priority issues:
       - Code duplication
       - Poor naming conventions
       - Missing documentation
    5. Provide actionable recommendations with code examples

    ## Output Format
    ~~~
    ## Summary
    [2-3 sentence overview]

    ## Issues Found
    | Priority | Issue | Location | Recommendation |
    |----------|-------|----------|----------------|
    | CRITICAL | ... | ... | ... |

    ## Suggested Changes
    [Code blocks with improvements]
    ~~~

    ## Constraints
    - Never approve code with CRITICAL security issues
    - Always provide specific line numbers for issues
    - Include positive feedback on well-written code

Now my agent code just loads this file:
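    from pathlib import Path


    class SkillBasedAgent:
        def __init__(self, skill_path: str, llm):
            self.llm = llm  # any client object that exposes call(instructions, text)
            # expanduser() is needed because open() does not expand "~"
            self.skill = Path(skill_path).expanduser().read_text(encoding="utf-8")

        def run(self, context: str) -> str:
            return self.llm.call(self.skill, context)


    # Usage (llm and code_to_review come from the surrounding application)
    reviewer = SkillBasedAgent("~/.claude/skills/code-reviewer/skill.md", llm)
    result = reviewer.run(code_to_review)

The behavior is now separate from the code. I can edit the markdown file without touching Python.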
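Because the agent now depends on a file that anyone can edit, it can help to check the file before using it. The sketch below is one possible approach and not part of my original setup: `parse_sections` and `missing_sections` are hypothetical helpers, and the list of required headings is an assumption based on the skill template above.

```python
import re

# Assumed minimum set of headings; adjust to your own template.
REQUIRED_SECTIONS = ("Purpose", "Instructions", "Constraints")
# Fence markers that may wrap example output inside a skill file.
FENCES = ("~~~", "`" * 3)


def parse_sections(skill_text: str) -> dict[str, str]:
    """Split a skill file into {heading: body} using its '## ' headings."""
    sections: dict[str, str] = {}
    current = None
    in_fence = False
    for line in skill_text.splitlines():
        # Headings inside a fenced example belong to the example, not the skill.
        if line.strip().startswith(FENCES):
            in_fence = not in_fence
        match = None if in_fence else re.match(r"^## (.+?)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def missing_sections(skill_text: str) -> list[str]:
    """Return required headings that are absent or have an empty body."""
    sections = parse_sections(skill_text)
    return [name for name in REQUIRED_SECTIONS if not sections.get(name)]


example = """# Code Review Skill

## Purpose
Review code for quality, security, and maintainability.

## Instructions
1. Check for CRITICAL issues
"""
print(missing_sections(example))  # ['Constraints']
```

Running a check like this before handing the file to the model catches a teammate accidentally deleting a section, which a Python reviewer would otherwise never see.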
Playbooks for Complex Workflows
A skill handles a single task. A playbook handles multi-step workflows with phases and dependencies.
    # Playbook: Technical Research

    ## Trigger
    User asks for research on a technical topic or when implementing unfamiliar features.

    ## Phases

    ### Phase 1: Information Gathering
    - Input: Research topic from user
    - Actions:
      1. Search official documentation using context7 MCP
      2. Search recent articles using web-search MCP
      3. Search GitHub for relevant code examples
    - Output: Collected sources and initial findings

    ### Phase 2: Analysis
    - Dependencies: Phase 1 output
    - Actions:
      1. Compare approaches from different sources
      2. Identify best practices and anti-patterns
      3. Evaluate relevance to current project context
    - Output: Structured analysis with recommendations

    ### Phase 3: Synthesis
    - Dependencies: Phase 2 analysis
    - Actions:
      1. Create implementation recommendations
      2. Identify potential risks and mitigations
      3. Suggest next steps for user
    - Output: Actionable research summary

    ## Success Criteria
    - At least 3 authoritative sources consulted
    - Clear comparison of different approaches
    - Actionable recommendations provided

The playbook pattern gives the AI a structured workflow. Each phase has clear inputs, outputs, and dependencies. This makes the AI much less likely to jump around or skip steps.
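The phase and dependency structure can also be checked mechanically before the playbook is used. The following is a minimal sketch under stated assumptions: it expects headings of the form `### Phase N: Name` and dependency lines of the form `- Dependencies: Phase N ...`, as in the playbook above, and `Phase`, `parse_phases`, and `dependency_problems` are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Phase:
    number: int
    name: str
    depends_on: list[int] = field(default_factory=list)


def parse_phases(playbook_text: str) -> list[Phase]:
    """Collect '### Phase N: Name' headings and their '- Dependencies:' lines."""
    phases: list[Phase] = []
    for line in playbook_text.splitlines():
        heading = re.match(r"^### Phase (\d+): (.+?)\s*$", line)
        if heading:
            phases.append(Phase(int(heading.group(1)), heading.group(2)))
            continue
        dependency = re.match(r"^- Dependencies: (.+)$", line.strip())
        if dependency and phases:
            # Pull every "Phase N" reference out of the free-text dependency line.
            numbers = re.findall(r"Phase (\d+)", dependency.group(1))
            phases[-1].depends_on.extend(int(n) for n in numbers)
    return phases


def dependency_problems(phases: list[Phase]) -> list[str]:
    """Report phases that depend on a phase that is missing or comes later."""
    problems = []
    seen: set[int] = set()
    for phase in phases:
        for dep in phase.depends_on:
            if dep not in seen:
                problems.append(
                    f"Phase {phase.number} depends on Phase {dep}, which has not run yet"
                )
        seen.add(phase.number)
    return problems


playbook = """### Phase 1: Information Gathering
- Input: Research topic from user
### Phase 2: Analysis
- Dependencies: Phase 1 output
### Phase 3: Synthesis
- Dependencies: Phase 2 analysis
"""
phases = parse_phases(playbook)
print([(p.number, p.name, p.depends_on) for p in phases])
print(dependency_problems(phases))  # [] for the playbook above
```

A check like this only validates the file. Following the phases in order is still left to the model.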
Directory Structure
Here’s how I organize my skills:
    ~/.claude/skills/
    ├── code-reviewer/
    │   └── skill.md        # Code review skill
    ├── security-reviewer/
    │   └── skill.md        # Security-focused review
    ├── planner/
    │   └── skill.md        # Research and planning
    ├── tdd-guide/
    │   └── skill.md        # Test-driven development
    └── refactor-cleaner/
        └── skill.md        # Code cleanup

Each skill is self-contained in its own directory. This makes it easy to share, version, and modify individual skills.
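With one directory per skill, a custom agent can find every available skill by scanning the root folder. This is a small sketch that assumes the layout shown above; `discover_skills` is a hypothetical helper name.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map each skill directory name to its skill.md file."""
    return {
        skill_file.parent.name: skill_file
        for skill_file in sorted(root.glob("*/skill.md"))
    }


if __name__ == "__main__":
    skills_root = Path("~/.claude/skills").expanduser()
    for name, path in discover_skills(skills_root).items():
        print(f"{name}: {path}")
```

Directories without a `skill.md` are simply skipped, so work-in-progress folders do not break the lookup.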
Self-Modification: Agents That Improve Themselves
The most powerful aspect of markdown-based skills is that agents can edit them at runtime. I added this pattern to my skills:
    ## Self-Modification Capability

    ### When to Modify
    - After repeated failures on similar tasks
    - When user feedback indicates improvement needed
    - When environment constraints change

    ### How to Modify
    1. Read current configuration file
    2. Identify specific section to update
    3. Propose changes to user for approval
    4. Apply approved changes to markdown file
    5. Log modification in changelog section

I tested this with my code reviewer. After it failed to catch a specific pattern three times, it suggested:
    I've noticed I keep missing React useEffect cleanup functions. Should I add this to my CRITICAL checks?

    Proposed addition to skill.md:

    ## Instructions (addition)
    2. Check for CRITICAL issues first:
       - Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
       - Logic errors that could cause runtime failures
       - Missing useEffect cleanup in React components  <-- NEW

I approved, and it modified its own skill file. The next time, it caught the issue.
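The propose, approve, apply, and log steps can be enforced in code so that a skill file never changes without a human seeing the diff first. The sketch below is an illustration under assumptions, not the mechanism Claude Code uses: `propose_change` and `apply_change` are hypothetical helpers, and the changelog is assumed to be the last section of the file.

```python
import difflib
import tempfile
from datetime import date
from pathlib import Path


def propose_change(skill_path: Path, new_text: str) -> str:
    """Return a unified diff between the current skill file and the proposal."""
    old_lines = skill_path.read_text(encoding="utf-8").splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile="current", tofile="proposed")
    )


def apply_change(skill_path: Path, new_text: str, approved: bool, note: str) -> bool:
    """Write the proposal only if a human approved it, and log it in the changelog."""
    if not approved:
        return False
    # Assumption: the changelog is the final section, so entries are appended at the end.
    if "## Changelog" not in new_text:
        new_text = new_text.rstrip("\n") + "\n\n## Changelog\n"
    new_text = new_text.rstrip("\n") + f"\n- {date.today().isoformat()}: {note}\n"
    skill_path.write_text(new_text, encoding="utf-8")
    return True


# Demo on a temporary file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "skill.md"
    skill.write_text("## Instructions\n- Logic errors\n", encoding="utf-8")
    proposal = "## Instructions\n- Logic errors\n- Missing useEffect cleanup\n"
    print(propose_change(skill, proposal))
    # In practice `approved` comes from the user's answer, never from the agent.
    apply_change(skill, proposal, approved=True, note="Add useEffect cleanup check")
    print(skill.read_text(encoding="utf-8"))
```

Keeping the approval flag outside the agent's control is the important design choice here: the agent may draft the new text, but only the calling code writes it.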
Why This Matters
Separating instructions from code changed how I work with AI agents:
Separation of concerns: Instructions live in markdown, logic in code. Each can evolve independently.
Non-technical contributions: My product manager can adjust how the research agent works by editing a markdown file. No Python knowledge required.
Version control: I track changes to agent behavior separately from codebase changes. I can see exactly when a behavior changed and why.
Runtime flexibility: I can modify agent behavior without redeploying, as long as the skill file is read fresh for each run rather than cached at startup.
Agent self-improvement: Agents can suggest and apply their own improvements based on experience.
Cross-platform portability: The same skill files work with Claude Code, LangChain, or any AI engine that supports markdown instructions.
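The runtime flexibility point depends on when the file is read: a loader that reads it once at startup keeps the old text in memory until the process restarts. A minimal sketch of an agent that re-reads the file on every call is shown below. `ReloadingSkillAgent` and `echo_llm` are hypothetical names, and the `(instructions, context) -> reply` callable stands in for whatever model client you use.

```python
import tempfile
from pathlib import Path
from typing import Callable


class ReloadingSkillAgent:
    """Reads the skill file on every run so edits apply without a restart."""

    def __init__(self, skill_path: str, llm_call: Callable[[str, str], str]):
        self.skill_path = Path(skill_path).expanduser()
        self.llm_call = llm_call  # assumed shape: (instructions, context) -> reply

    def run(self, context: str) -> str:
        skill = self.skill_path.read_text(encoding="utf-8")  # fresh read each time
        return self.llm_call(skill, context)


def echo_llm(instructions: str, context: str) -> str:
    # Stand-in for a real model call; returns the first line of the instructions.
    return instructions.splitlines()[0]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "skill.md"
    path.write_text("Version 1 of the skill\n", encoding="utf-8")
    agent = ReloadingSkillAgent(str(path), echo_llm)
    print(agent.run("some code"))  # Version 1 of the skill
    path.write_text("Version 2 of the skill\n", encoding="utf-8")
    print(agent.run("some code"))  # Version 2 of the skill
```

Reading a small markdown file per call is cheap compared with a model request, so caching is rarely worth the stale-instructions risk.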
Common Mistakes
I made several mistakes when starting with markdown skills:
Mistake 1: Over-engineering the structure
    # Complex Skill System

    ## Metadata
    - Version: 1.0.3
    - Author: System
    - Created: 2024-01-15
    - Dependencies: [list of 20 items]
    - Configuration: [nested config]

    ## Abstract
    [Long theoretical description]

    ## Theoretical Framework
    [Academic explanations]

Keep it simple. The AI reads instructions, not documentation about instructions.
    # Code Review Skill

    ## Purpose
    Review code for quality, security, and maintainability.

    ## Instructions
    1. Check for CRITICAL issues (security, logic errors)
    2. Check for HIGH issues (error handling, performance)
    3. Check for MEDIUM issues (style, documentation)

Mistake 2: Missing constraints
Without constraints, the AI will do things you don’t expect.
Too vague:

    ## Instructions
    Review the code and provide feedback.

Better:

    ## Constraints
    - Never approve code with hardcoded credentials
    - Maximum review time: 5 minutes
    - Focus only on the specified file, not dependencies
    - If file >500 lines, suggest splitting first

Mistake 3: No examples
Examples show the AI exactly what you want.
Too vague:

    ## Output Format
    Provide a summary and list of issues.

Better:

    ## Output Format
    ~~~
    ## Summary
    Found 2 issues in the authentication flow, 1 of them critical.

    ## Issues Found
    | Priority | Issue | Location | Recommendation |
    |----------|-------|----------|----------------|
    | CRITICAL | SQL injection | auth.py:45 | Use parameterized queries |
    | HIGH | Missing error handling | auth.py:67 | Add a try/except block |
    ~~~

Mistake 4: Hardcoded values in markdown
Don’t put specific values in the skill file.
Too specific:

    ## Instructions
    Review files in /home/user/projects/myapp/src/

Better:

    ## Instructions
    Review files in {project_path}

    The user will provide {project_path} at runtime.

Integration with Claude Code
Claude Code CLI natively supports this skills system. I reference my skills in my project’s CLAUDE.md:
    ## Available Skills
    - Use `code-reviewer` when: After writing new code
    - Use `security-reviewer` when: Before committing sensitive changes
    - Use `planner` when: Starting complex features
    - Use `tdd-guide` when: Writing tests first

    ## Skill Invocation
    When a trigger condition matches, automatically load and follow the corresponding skill file from ~/.claude/skills/{skill-name}/skill.md

Now when I write code and Claude detects it should review, it loads the code-reviewer skill and follows those instructions.
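Outside Claude Code, a custom agent has to do this lookup itself and also fill in runtime placeholders such as the `{project_path}` value from Mistake 4. The following is a rough sketch of both steps. `skill_path` and `render_skill` are hypothetical helpers, and deciding which trigger matches is still left to the model or the calling code.

```python
from pathlib import Path

SKILLS_ROOT = Path("~/.claude/skills").expanduser()


def skill_path(skill_name: str, root: Path = SKILLS_ROOT) -> Path:
    """Resolve a skill name to <root>/<skill-name>/skill.md."""
    # Reject names that could escape the skills directory.
    if "/" in skill_name or "\\" in skill_name or skill_name in ("", ".", ".."):
        raise ValueError(f"invalid skill name: {skill_name!r}")
    return root / skill_name / "skill.md"


def render_skill(skill_text: str, values: dict[str, str]) -> str:
    """Replace {name} placeholders; unknown placeholders are left untouched."""
    for name, value in values.items():
        skill_text = skill_text.replace("{" + name + "}", value)
    return skill_text


print(skill_path("code-reviewer"))
print(render_skill("Review files in {project_path}", {"project_path": "src/"}))
# Second line printed: Review files in src/
```

Plain string replacement is used instead of `str.format` because skill files often contain literal braces in code or JSON examples, which `format` would try to interpret.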
Summary
In this post, I showed how to create AI agent skills and playbooks using markdown instead of hardcoding behavior in code. The key point is that modern AI engines can natively interpret markdown instructions, which separates behavior definition from implementation code.
I converted my hardcoded Python agents to skill-based agents with markdown files. This enabled non-technical editing, version control of behaviors, and even self-modification by the agents themselves. The skill structure includes purpose, instructions, tools, constraints, and examples. For complex workflows, playbooks add phases with inputs, outputs, and dependencies.
Start with a simple skill file for your most common agent task. Once you see the pattern work, expand into multi-phase playbooks. Your agents will become more maintainable, more flexible, and capable of improving themselves.
Final Words + More Resources
My intention with this article was to share my knowledge and experience with others. If you want to contact me, you can reach me by email.