Progressive Discovery in OpenAI Codex: How Agents Find and Use Skills

What is Progressive Discovery?

Progressive discovery is how Codex agents find relevant Skills during interaction—not by explicit invocation, but by matching Skill descriptions to the current task context.

This means your Skill’s description is more critical than the prompt itself: a well-written description ensures the agent discovers your Skill at the right moment; a poor description means your Skill never gets used.

The Core Mechanism

Unlike explicit API calls where you call a function by name, Skills are discovered contextually. The agent reads descriptions and decides relevance based on what’s happening in the conversation.

Discovery Flow
+-------------------+      +---------------------+      +-------------------+
|   User Request    |      |   Agent Evaluates   |      |  Skill Selected   |
|  "test my code"   | -->  | Skill Descriptions  | -->  | (if match found)  |
+-------------------+      +---------------------+      +-------------------+
                                      |
                                      v
                           +---------------------+
                           |    Description:     |
                           |  "Generate tests"   |
                           |      ✓ MATCH        |
                           +---------------------+

The agent evaluates descriptions against the current context. If your description doesn’t match the task language, your Skill stays invisible.
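To make the flow above concrete, here is a minimal sketch in Python. It is a toy model, not how Codex actually selects Skills: it assumes relevance can be approximated by counting words shared between the request and each description. The names `tokenize`, `score`, `select_skill`, and the `min_score` threshold are all hypothetical.

```python
import re
from typing import Optional

# Assumption: a tiny stopword list is enough for this illustration.
STOPWORDS = {"the", "my", "for", "to", "of", "and", "with", "me"}

def tokenize(text: str) -> set[str]:
    """Lowercase, keep words of 2+ letters, drop stopwords, strip trailing 's'."""
    words = re.findall(r"[a-z]{2,}", text.lower())
    # rstrip("s") is a crude stemmer so that "tests" and "test" compare equal.
    return {(w.rstrip("s") or w) for w in words if w not in STOPWORDS}

def score(request: str, description: str) -> int:
    """Number of distinct words the request and the description share."""
    return len(tokenize(request) & tokenize(description))

def select_skill(request: str, skills: dict[str, str], min_score: int = 1) -> Optional[str]:
    """Return the name of the best-scoring Skill, or None if nothing reaches min_score."""
    best_name, best_score = None, 0
    for name, description in skills.items():
        current = score(request, description)
        if current > best_score:
            best_name, best_score = name, current
    return best_name if best_score >= min_score else None

# Hypothetical Skill names mapped to their descriptions.
skills = {
    "vague-helper": "Helps with coding tasks",
    "test-generator": "Generate tests",
}

print(select_skill("test my code", skills))    # test-generator
print(select_skill("deploy the app", skills))  # None
```

In this toy model the vague description shares no words with "test my code", so it is never selected. A real agent reads descriptions with far more nuance, but the failure mode is the same: nothing in the description connects it to the request.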

Why Description Matters Most

Here’s what I learned from the official documentation:

  1. Skills are workflow prompts that agents “progressively discover” during interaction
  2. Because of this mechanism, the quality of a Skill’s description directly determines whether the agent finds it
  3. Writing the description well matters more than the instruction content itself

This is a fundamental shift from explicit tool calling to contextual discovery. You’re not writing documentation for humans—you’re writing signals for an AI to match.

How to Write Discoverable Descriptions

The Wrong Way

bad-skill.md
---
description: Helps with coding tasks
---

This description is useless. “Coding tasks” is too vague. The agent won’t know when to use this Skill.

A Better Attempt

better-skill.md
---
description: Generates comprehensive unit tests for TypeScript functions
  by analyzing function signatures, edge cases, and integration points.
  Use when you need test coverage for new or existing functions.
---

Better—it includes specific terms like “TypeScript”, “unit tests”, “function signatures”. But we can improve further.

The Optimal Approach

good-skill.md
---
name: typescript-test-generator
description: |
  Generate Jest unit tests for TypeScript functions.
  Triggers: test generation, unit tests, jest, typescript testing,
  function coverage, test-first development.
  Use when: creating tests, improving coverage, TDD workflow.
---

This description includes:

  • Trigger words: “test generation”, “unit tests”, “jest”, “typescript testing”
  • Domain terms: “function coverage”, “TDD workflow”
  • Clear context: “Use when” section tells the agent exactly when to invoke
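The three parts listed above can be separated mechanically, which is handy when you want to review trigger lists across many Skills. The sketch below is only an illustration: `parse_description` is a hypothetical helper, and the "Triggers:" and "Use when:" labels are treated as this article's writing convention, not as fields the agent is known to parse.

```python
def parse_description(description: str) -> dict[str, object]:
    """Split a description into its summary, trigger phrases, and 'use when' contexts."""
    sections: dict[str, list[str]] = {"summary": [], "triggers": [], "use_when": []}
    labels = {"triggers:": "triggers", "use when:": "use_when"}
    current = "summary"
    for line in description.strip().splitlines():
        text = line.strip()
        # A line starting with a known label switches the active section.
        for label, key in labels.items():
            if text.lower().startswith(label):
                current = key
                text = text[len(label):].strip()
                break
        if text:
            sections[current].append(text)

    def split_items(lines: list[str]) -> list[str]:
        # Join wrapped lines, drop the final period, split on commas.
        joined = " ".join(lines).rstrip(".")
        return [item.strip() for item in joined.split(",") if item.strip()]

    return {
        "summary": " ".join(sections["summary"]),
        "triggers": split_items(sections["triggers"]),
        "use_when": split_items(sections["use_when"]),
    }

example = """
Generate Jest unit tests for TypeScript functions.
Triggers: test generation, unit tests, jest, typescript testing,
function coverage, test-first development.
Use when: creating tests, improving coverage, TDD workflow.
"""

parsed = parse_description(example)
print(parsed["triggers"])
print(parsed["use_when"])
```

Seeing the triggers as a flat list makes it easier to spot phrases that are missing or that no user would actually type.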

Best Practices for Discovery

I follow these rules when writing Skill descriptions:

1. Match Natural Language Queries

Think about how users phrase requests. “Test my code”, “I need tests”, “add unit tests”—these are the patterns your description should match.

natural-language-matching.md
---
description: |
  Create unit tests for your code. Handles Jest, Vitest, and Mocha.
  Triggers: test my code, add tests, write tests, unit tests,
  test coverage, testing.
---
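One way to sanity-check a trigger list like this is to collect phrasings you expect from users and see which ones contain none of the triggers. The sketch below uses literal, case-insensitive substring matching, which is an assumption of mine and much stricter than whatever the agent does; a miss is a hint to look closer, not proof that discovery will fail. `uncovered_requests` is a hypothetical helper.

```python
def uncovered_requests(requests: list[str], triggers: list[str]) -> list[str]:
    """Return the requests that contain none of the trigger phrases (case-insensitive)."""
    lowered = [t.lower() for t in triggers]
    return [r for r in requests if not any(t in r.lower() for t in lowered)]

triggers = ["test my code", "add tests", "write tests", "unit tests",
            "test coverage", "testing"]
requests = ["Test my code", "I need tests", "add unit tests"]

print(uncovered_requests(requests, triggers))  # ['I need tests']
```

Here "I need tests" slips through because no trigger appears in it verbatim. Whether that matters in practice depends on the agent, but it is a cheap prompt to consider adding a broader phrase.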

2. Include Domain Terminology

Agents look for domain-specific terms. If your Skill works with React, include “React”, “components”, “hooks” in the description.

domain-terms.md
---
description: |
  Generate React component tests with Testing Library.
  Triggers: react test, component test, testing library,
  react testing, render test.
  Use when: testing React components, writing component specs.
---

3. Be Specific About When the Skill Applies

Don’t make the agent guess. Tell it exactly when this Skill should be used.

specific-context.md
---
description: |
  Analyze Python code for security vulnerabilities.
  Triggers: security check, vulnerability scan, code security,
  python security, bandit, security audit.
  Use when: reviewing code for security issues, auditing Python
  applications, checking for vulnerabilities.
---

4. Test With Real Agent Interactions

Write your description, then interact with the agent. Does it discover your Skill when you expect? If not, iterate on the description.
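To keep this iteration honest, it helps to write down the prompts you try and the Skill you expected, then compare against what you observed. The sketch below is a small bookkeeping harness, not an integration with Codex: `discover` is a placeholder for however you record which Skill the agent picked, and the `observed` values are invented placeholders, not real results.

```python
from typing import Callable, Optional

# (user prompt, expected Skill name or None when no Skill should be used)
Case = tuple[str, Optional[str]]
Discover = Callable[[str], Optional[str]]

def run_discovery_cases(
    cases: list[Case], discover: Discover
) -> list[tuple[str, Optional[str], Optional[str]]]:
    """Return (prompt, expected, actual) for every case where discovery differed."""
    failures = []
    for prompt, expected in cases:
        actual = discover(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    return failures

cases: list[Case] = [
    ("test my code", "typescript-test-generator"),
    ("add unit tests", "typescript-test-generator"),
    ("I need tests", "typescript-test-generator"),
    ("rename this variable", None),
]

# Hypothetical notes taken by hand after each session (placeholder values).
observed = {
    "test my code": "typescript-test-generator",
    "add unit tests": "typescript-test-generator",
    "I need tests": None,
    "rename this variable": None,
}

failures = run_discovery_cases(cases, observed.get)
for prompt, expected, actual in failures:
    print(f"{prompt!r}: expected {expected}, got {actual}")
```

Each failure points at a phrasing the description does not yet cover. Including a case that should match no Skill also guards against descriptions that trigger too eagerly.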

Common Mistakes to Avoid

Vague Descriptions

vague-example.md
---
description: Helps with code
---

The agent has no idea what this means. Avoid generic terms.

Missing Domain Terminology

missing-domain.md
---
description: Generate tests
---

Which framework? What language? Be specific.

Overly Long Descriptions

too-long.md
---
description: |
  This comprehensive Skill analyzes your entire codebase,
  examines every function, determines the optimal testing strategy,
  considers various edge cases including null values, empty strings,
  boundary conditions, and generates a complete test suite that
  covers all possible scenarios... (continues for 20 more lines)
---

Long descriptions dilute key signals. Keep it focused.

Technical Jargon the Agent Won’t Match

jargon.md
---
description: Implements TDD-BDD hybrid methodology with CI/CD integration
---

Users don’t ask for “TDD-BDD hybrid methodology”. They say “test my code”. Use natural language.

The Discovery Mindset

When I write Skill descriptions, I think about discovery, not documentation:

Discovery Mindset
Traditional Documentation: "Here's what this Skill does"
Discovery-Oriented: "Here's when to use this Skill"
Traditional: "Skill: test-generator"
Discovery: "Use when: user wants tests, needs coverage, asks about testing"

The agent doesn’t need to know what your Skill does—it needs to know when to use it.

A Practical Checklist

Before publishing any Skill, I check:

  • Description includes trigger words users would say
  • Domain-specific terms are present
  • “Use when” section clearly defines context
  • Description is under 200 characters (shorter = stronger signal)
  • No generic terms like “helps”, “assists”, “supports”
  • Tested with actual agent interaction
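Several of these checks are mechanical, so they can be run as a quick lint before publishing. The following is a sketch under my own assumptions: `lint_description`, `GENERIC_TERMS`, and `MAX_LENGTH` are hypothetical names, the 200-character budget is taken from the checklist above, and the "Triggers:" and "Use when:" labels follow this article's convention. Domain terminology and real agent testing still need a human.

```python
import re

GENERIC_TERMS = ("helps", "assists", "supports")
MAX_LENGTH = 200  # character budget from the checklist; adjust to taste

def lint_description(description: str) -> list[str]:
    """Return a list of checklist violations; an empty list means every check passed."""
    problems = []
    text = " ".join(description.split())  # collapse newlines and repeated spaces
    lowered = text.lower()
    if len(text) >= MAX_LENGTH:
        problems.append(f"too long: {len(text)} characters (limit {MAX_LENGTH})")
    for term in GENERIC_TERMS:
        # Whole-word match so that e.g. "supportsFoo" is not flagged.
        if re.search(rf"\b{term}\b", lowered):
            problems.append(f"generic term: {term}")
    if "triggers:" not in lowered:
        problems.append("missing 'Triggers:' section")
    if "use when:" not in lowered:
        problems.append("missing 'Use when:' section")
    return problems

print(lint_description("Helps with coding tasks"))
print(lint_description(
    "Generate Jest unit tests for TypeScript functions. "
    "Triggers: unit tests, jest, add tests. "
    "Use when: creating tests, improving coverage."
))
```

The first call reports the generic term and both missing sections; the second returns an empty list. Multi-line descriptions that carry both a long trigger list and a "Use when" line can exceed the length budget, so you may need to trim triggers to stay under it.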

Key Takeaway

In progressive discovery, your Skill is only as good as its description. The most powerful Skill with a poor description remains invisible. Invest time writing descriptions that match how agents evaluate context—this determines whether your Skill gets discovered and used.

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can contact me by email.
