
How to Build Custom Claude Code Skills with Role-Based Prompting

I kept repeating the same instructions to Claude Code: “Write tests first, check coverage, use this naming pattern, review the code.” Every session felt like I was retraining Claude from scratch.

Then I discovered skills. A skill is a reusable instruction package that Claude loads automatically when triggered. But the real insight came from a Reddit thread about gstack: the key pattern isn’t just having instructions; it’s giving AI agents distinct roles in separate context windows.

The Problem: Context Repetition

Every time I started a new coding session, I had to:

  • Explain my testing requirements (TDD, 80% coverage)
  • Describe my naming conventions
  • Outline my review process
  • Specify my security checklist

This wasted tokens and time. Worse, Claude sometimes forgot mid-session and I had to remind it again.

I tried putting instructions in a README file, but Claude wouldn’t consistently read it. I tried adding comments to my code, but that only helped for specific files.

The Solution: SKILL.md Files

Skills solve this problem with a simple structure:

Skill directory structure
skill-name/
├── SKILL.md (required)
│ ├── YAML frontmatter (name + description)
│ └── Markdown instructions
├── scripts/ (optional - executable code)
├── references/ (optional - documentation)
└── assets/ (optional - templates, files)

The critical part is the YAML frontmatter:

SKILL.md frontmatter
---
name: tdd-workflow
description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.
---

I initially made a mistake with the description field. I wrote:

WRONG: Description missing trigger context
description: Test-driven development workflow with 80% coverage requirements.

This was too vague. Claude didn’t know when to use the skill. The description needs two things: what the skill does AND when to trigger it.

CORRECT: Description with trigger context
description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.

Now Claude picks up this skill when I ask for a new feature or a bug fix, because the description tells it which requests the skill applies to.
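A quick way to catch the vague-description mistake before it costs a session is a small lint check. The sketch below is only a heuristic under stated assumptions: `TRIGGER_HINTS`, `parse_frontmatter`, and `lint_description` are hypothetical names, the parser handles only simple one-line `key: value` pairs rather than full YAML, and the phrase list is a guess at what "trigger context" looks like.

```python
import re

# Hypothetical heuristic: phrases suggesting the description says WHEN to trigger.
TRIGGER_HINTS = ("use this skill when", "use when", "when the user", "whenever")


def parse_frontmatter(text: str) -> dict:
    """Parse simple one-line `key: value` pairs between the leading '---' markers."""
    match = re.match(r"^---\s*\n(.*?)\n---\s*(?:\n|$)", text, re.DOTALL)
    if match is None:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        # Split on the first colon only, so colons inside the value survive.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def lint_description(text: str) -> list:
    """Return a list of problems found in a SKILL.md frontmatter."""
    fields = parse_frontmatter(text)
    problems = []
    if not fields.get("name"):
        problems.append("missing name")
    description = fields.get("description", "")
    if not description:
        problems.append("missing description")
    elif not any(hint in description.lower() for hint in TRIGGER_HINTS):
        problems.append("description does not say when to use the skill")
    return problems
```

Run against the two descriptions above, the first version is flagged for lacking trigger context and the second passes.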

Progressive Disclosure: Three Loading Levels

Skills use a three-level loading system that I didn’t understand at first:

Skill loading levels
Level 1: Metadata (name + description)
→ Always in context (~100 words)
→ Claude sees this for every skill
Level 2: SKILL.md body
→ Loaded when skill triggers (<5k words)
→ Full instructions, patterns, examples
Level 3: Bundled resources
→ Loaded as needed (scripts, references)
→ Effectively unlimited, because scripts can run without being read into context

This means the description field is the trigger. I wasted time writing detailed “When to Use” sections in the SKILL.md body, but Claude never saw them when deciding whether to use the skill, because the body only loads after triggering.
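The loading levels can be modeled in a few lines. This is an illustrative toy, not how Claude Code is implemented: the `Skill` class and `build_context` function are hypothetical, and Level 3 resources are left out. It shows why a “When to Use” section in the body cannot influence triggering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Toy model of a skill; the field layout is an assumption for illustration."""
    name: str
    description: str  # Level 1: always in context
    body: str         # Level 2: added only after the skill triggers


def build_context(skills, triggered):
    """Return the text chunks visible to the model for a given set of triggered skills."""
    # Level 1: every skill contributes its metadata line.
    context = [f"{s.name}: {s.description}" for s in skills]
    # Level 2: only triggered skills contribute their body.
    context += [s.body for s in skills if s.name in triggered]
    return context
```

With nothing triggered, `build_context` returns only the metadata lines, so any trigger guidance that lives in `body` is invisible at the moment the decision is made.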

Role-Based Prompting: The gstack Insight

A Reddit thread about gstack revealed the deeper pattern. The commenter noted:

“The insight that you get better output by separating planning, review, and QA into distinct context windows is valuable.”

This isn’t about one skill doing everything. It’s about multiple skills with distinct roles:

Role-based skill architecture
┌─────────────┐
│ Planner │ → Think → Plan
│ Skill │ (separate context)
└─────────────┘
↓ (output passed to next)
┌─────────────┐
│ Builder │ → Build → Code
│ Skill │ (fresh context)
└─────────────┘
↓ (output passed to next)
┌─────────────┐
│ Reviewer │ → Review → QA
│ Skill │ (fresh context)
└─────────────┘

The gstack approach used roles like CEO, Engineering Manager, Designer, Reviewer & QA Lead, and Security Officer. Each role operates in a clean context window, preventing the confusion that happens when one agent tries to do everything.
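The hand-off between roles can be sketched as a pipeline where each stage sees only its own instructions and the previous stage's output. Everything here is hypothetical: `ROLES`, `run_pipeline`, and the `run_agent` callable stand in for however you actually start a fresh context (a new session, for example), and none of it is gstack's or Claude Code's API.

```python
from typing import Callable, Dict

# Hypothetical stand-in for "run one agent in a fresh context window".
# It receives the role's instructions and the previous stage's output, nothing else.
Agent = Callable[[str, str], str]

ROLES = [
    ("planner", "Produce an implementation plan."),
    ("builder", "Implement the plan test-first."),
    ("reviewer", "Review the change and list issues."),
]


def run_pipeline(task: str, run_agent: Agent) -> Dict[str, str]:
    """Run each role in order, passing only the previous output forward."""
    outputs = {}
    handoff = task
    for role, instructions in ROLES:
        # No shared history: the handoff text is the only thing carried over.
        handoff = run_agent(instructions, handoff)
        outputs[role] = handoff
    return outputs
```

The design choice worth noticing is what is absent: there is no shared conversation object, so the reviewer cannot inherit the builder's assumptions.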

I adapted this for my workflow:

| Role | Skill | Trigger |
| --- | --- | --- |
| Planner | planning-with-files | “plan implementation” |
| Builder | tdd-workflow | “new feature” or “fix bug” |
| Reviewer | code-reviewer | “review code” |
| Security | security-review | “security check” |

My First Skill: TDD Workflow

Here’s the skill I built first:

SKILL.md
---
name: tdd-workflow
description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.
---
# Test-Driven Development Workflow
## Core Principles
### 1. Tests BEFORE Code
ALWAYS write tests first, then implement code to make tests pass.
### 2. Coverage Requirements
- Minimum 80% coverage (unit + integration + E2E)
- All edge cases covered
- Error scenarios tested
### 3. Test Types
#### Unit Tests
- Individual functions and utilities
- Pure functions and helpers
#### Integration Tests
- API endpoints
- Database operations
#### E2E Tests
- Critical user flows
- Complete workflows
## TDD Workflow Steps
1. Write user journey: "As a [role], I want to [action]"
2. Generate test cases for each journey
3. Run tests (they should fail)
4. Implement minimal code
5. Run tests (they should pass)
6. Refactor
7. Verify 80%+ coverage

The key insight: I put ONLY the essential workflow in SKILL.md. Detailed patterns and examples went in a separate references file.

Separating Content: The Reference Pattern

I initially wrote a 400-line SKILL.md with every testing pattern I knew. This came uncomfortably close to the “keep SKILL.md under 500 lines” guideline and bloated the context window every time the skill triggered.

The solution was splitting content:

Split skill content structure
tdd-workflow/
├── SKILL.md (core workflow only - 80 lines)
└── references/
    ├── unit-test-patterns.md (detailed examples)
    ├── integration-patterns.md (API testing patterns)
    └── e2e-patterns.md (Playwright examples)

SKILL.md now references these files:

SKILL.md with references
## Testing Patterns
- **Unit Tests**: See [unit-test-patterns.md](references/unit-test-patterns.md)
- **Integration Tests**: See [integration-patterns.md](references/integration-patterns.md)
- **E2E Tests**: See [e2e-patterns.md](references/e2e-patterns.md)

Claude loads reference files only when needed, saving context for other tasks.
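Once content is split, it is easy to leave a dangling link or let SKILL.md creep back up in size. The check below is a minimal sketch under assumptions: `check_skill` is a hypothetical helper, it only recognizes Markdown links that start with `references/`, and the 500-line figure is the guideline quoted earlier.

```python
import re
from pathlib import Path

MAX_LINES = 500  # the "keep SKILL.md under 500 lines" guideline
# Matches the target of Markdown links such as [name](references/file.md).
LINK_PATTERN = re.compile(r"\]\((references/[^)\s]+)\)")


def check_skill(skill_dir: Path) -> list:
    """Return problems with SKILL.md size and its links into references/."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    problems = []
    line_count = len(text.splitlines())
    if line_count >= MAX_LINES:
        problems.append(f"SKILL.md has {line_count} lines (limit {MAX_LINES})")
    for relative in LINK_PATTERN.findall(text):
        # Links are resolved relative to the skill directory.
        if not (skill_dir / relative).is_file():
            problems.append(f"missing reference: {relative}")
    return problems
```

Calling `check_skill(Path.home() / ".claude" / "skills" / "tdd-workflow")` would return an empty list when every linked reference file exists.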

When to Use Scripts vs. Text Instructions

I struggled with deciding when to write a script versus when to write text instructions.

The guideline is:

| Freedom Level | Format | When to Use |
| --- | --- | --- |
| High freedom | Text instructions | Multiple approaches valid, context-dependent decisions |
| Medium freedom | Pseudocode with parameters | Preferred pattern exists, some variation OK |
| Low freedom | Scripts with few parameters | Operations fragile, consistency critical |

Example:

Script vs. instruction decision
Task: Run tests and check coverage
→ High freedom: "Run tests, verify 80% coverage"
→ Text instruction works
Task: Rotate PDF pages
→ Low freedom: Must work exactly the same every time
→ Script: scripts/rotate_pdf.py
Task: Create new API endpoint
→ Medium freedom: Pattern exists, details vary
→ Pseudocode template with parameters
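For contrast, here is what the low-freedom end can look like. It reuses the coverage check purely as an illustration of “a script with few parameters”, for a case where that step has to behave identically every time. The sketch assumes coverage.py's JSON report (produced by `coverage json`), and the function names are hypothetical.

```python
import json
from pathlib import Path


def coverage_ok(report_path: Path, minimum: float = 80.0) -> bool:
    """Return True when total coverage in a coverage.py JSON report meets the minimum."""
    report = json.loads(report_path.read_text(encoding="utf-8"))
    # coverage.py's JSON report stores the overall figure under totals.percent_covered.
    return report["totals"]["percent_covered"] >= minimum


def main(argv: list) -> int:
    """Exit-code style entry point: 0 when coverage is sufficient, 1 otherwise."""
    report_path = Path(argv[1]) if len(argv) > 1 else Path("coverage.json")
    if coverage_ok(report_path):
        print("coverage OK")
        return 0
    print("coverage below threshold")
    return 1

# A real script would end with: sys.exit(main(sys.argv))
```

The script exposes only a report path and a threshold, so there is nothing left for the model to interpret differently from one run to the next.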

My Skill Creation Process

I followed this workflow after reading the skill-creator documentation:

Step 1: Understand with Concrete Examples

I listed specific scenarios where I wanted Claude to behave consistently:

  • “When I say ‘add feature’, write tests first”
  • “When I say ‘fix bug’, write test for bug, then fix”
  • “When I say ‘refactor’, ensure tests still pass”

Step 2: Plan Reusable Contents

For each scenario, I asked: “What would Claude need every time?”

  • Test patterns → references/
  • Coverage commands → SKILL.md
  • Mock templates → references/

Step 3: Initialize with Script

I used the init script:

Initialize new skill
scripts/init_skill.py tdd-workflow --path ~/.claude/skills/

This created the directory structure with placeholder files.
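If you don't have the init script at hand, the scaffolding is simple enough to approximate. The following is not the real `init_skill.py`, just a hypothetical sketch of the same idea: it creates the directory layout shown earlier and a placeholder SKILL.md whose description reminds you to include trigger context.

```python
from pathlib import Path

# Placeholder SKILL.md; the TODO text is a reminder, not an official template.
TEMPLATE = """---
name: {name}
description: TODO - say what the skill does AND when to use it.
---
# {title}
"""


def init_skill(name: str, base: Path) -> Path:
    """Create base/name with SKILL.md and the optional resource directories."""
    skill_dir = base / name
    # exist_ok=False so an existing skill is never silently overwritten.
    skill_dir.mkdir(parents=True, exist_ok=False)
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir()
    title = name.replace("-", " ").title()
    (skill_dir / "SKILL.md").write_text(
        TEMPLATE.format(name=name, title=title), encoding="utf-8"
    )
    return skill_dir
```

For example, `init_skill("tdd-workflow", Path.home() / ".claude" / "skills")` would create the same location used in the command above.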

Step 4: Edit SKILL.md

I wrote the frontmatter description first (the trigger), then the body instructions.

Step 5: Package the Skill

Package skill for distribution
scripts/package_skill.py ~/.claude/skills/tdd-workflow

This validates the skill and creates a distributable .skill file.
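To make the packaging step less opaque, here is a rough sketch of what such a step might do. It rests on an assumption the article does not confirm: that a .skill file is an ordinary zip archive containing the skill directory. The `package_skill` function below is hypothetical and performs only one validation (SKILL.md must exist), which is far less than a real packager would check.

```python
import zipfile
from pathlib import Path


def package_skill(skill_dir: Path, out_dir: Path) -> Path:
    """Zip skill_dir into out_dir/<skill-name>.skill (assumed to be a plain zip)."""
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md")
    archive = out_dir / f"{skill_dir.name}.skill"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(skill_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the top-level folder is kept.
                arcname = path.relative_to(skill_dir.parent).as_posix()
                bundle.write(path, arcname)
    return archive
```

Note that `out_dir` should sit outside `skill_dir`, otherwise the archive would be written into the tree it is packaging.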

Step 6: Iterate Based on Usage

After using the skill, I noticed Claude wasn’t always running tests first. I added this to SKILL.md:

Added enforcement
### 1. Tests BEFORE Code
ALWAYS write tests first. If Claude starts implementing before tests, STOP and write tests.

The words “ALWAYS” and “STOP” made the instruction harder for Claude to ignore.

Common Mistakes I Made

Mistake 1: Description too abstract

WRONG: Abstract description
description: Testing workflow for quality code.

Claude never triggered this. I fixed it with concrete trigger phrases.

Mistake 2: “When to Use” in body instead of description

I wrote a “When to Use This Skill” section in SKILL.md body. But the body only loads after the skill triggers, so Claude never saw these instructions.

Mistake 3: All content in one file

My 400-line SKILL.md bloated context. I split it into references.

Mistake 4: No enforcement language

I wrote “Write tests first” and Claude sometimes ignored it. I changed to “ALWAYS write tests first. STOP if implementing before tests.”

Mistake 5: Wrong scope for resources

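I put project-specific schemas in a user-scope skill (under ~/.claude/skills/), where every project could see them. Now I use project-scope skills (under .claude/skills/ inside the repository) for project-specific content.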

Sprint Process: The Complete Workflow

The gstack thread described a “Sprint-as-a-process” framework:

Sprint process workflow
Think → Plan → Build → Review → Test → Ship → Reflect

I translated this into skill triggers:

Skill triggers for sprint process
1. Think: /think-mode (built-in)
2. Plan: "plan implementation" → planning skill
3. Build: "new feature" → tdd-workflow skill
4. Review: "review code" → code-reviewer skill
5. Test: Built into tdd-workflow
6. Ship: "commit" → git workflow
7. Reflect: "what did we learn" → summary

Running each phase in a fresh context window prevents the confusion that happens when one agent does everything.

What Changed After Building Skills

Before skills:

  • Repeated instructions every session
  • Claude forgot preferences mid-session
  • Inconsistent test coverage
  • Manual review checklist enforcement

After skills:

  • Claude loads TDD workflow automatically
  • Coverage stays above 80%
  • Review happens after every feature
  • Security checks are consistent

Summary

Building custom Claude Code skills involves:

  1. SKILL.md with proper frontmatter: Name and description (description is the trigger)
  2. Role-based prompting: Multiple skills with distinct roles, not one skill doing everything
  3. Progressive disclosure: Metadata always visible, body when triggered, references as needed
  4. Split content: Core workflow in SKILL.md, details in references
  5. Appropriate freedom level: Text for high freedom, scripts for low freedom

The most transferable lesson from the gstack discussion isn’t the tool itself. It’s the pattern: give AI agents distinct roles, structured processes, and clear boundaries.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.
