Is the public-apis GitHub Repository Still Useful Despite Having Stale Scripts?
I was building an AI agent that needed access to various APIs, and someone pointed me to the public-apis GitHub repository. With 396,000 stars, it’s one of the most starred repositories on GitHub. But when I cloned it, I noticed something concerning: all the automation scripts were over 5 years old.
Was this repository abandoned? Could I trust it for my project?
The Problem
I needed to discover APIs to wire into my agents. The public-apis repository seemed perfect—a massive curated list of free APIs, organized by category, with metadata about authentication, HTTPS support, CORS policies, and rate limits.
But then I looked at the actual code in the repository:
    $ git log --oneline scripts/
    a1b2c3d (HEAD -> master) Update API validator script - 5 years ago
    e4f5g6h Add link checker - 5 years ago
    i7j8k9l Initial automation scripts - 6 years ago

Five years is an eternity in API years. Endpoints change, services shut down, authentication methods evolve. If the maintenance scripts are that old, how can I trust the list?
I almost dismissed the entire repository. But then I realized I was conflating two different things:
- The curated list of APIs (in the README)
- The automation scripts (in the /scripts folder)
The Discovery
I started digging deeper. The README showed recent activity from hundreds of contributors. The scripts folder was a ghost town.
Repository Stats:
- 396,000 stars
- 42,000 forks
- 1,200+ contributors
- README: actively maintained
- Scripts: 5+ years stale

I asked myself: what was I actually trying to accomplish?
I wanted to discover APIs for my agents. I didn’t want to run maintenance scripts on the repository itself. The scripts being outdated didn’t affect my use case at all.
The real value was the metadata. Each API entry included:
- API name and description
- Authentication requirements
- HTTPS support
- CORS policy
- Rate limits
- Direct link to documentation
This is exactly what I needed to evaluate APIs for agent integration.
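To get those fields into a form code can work with, the README tables have to be parsed first. Here is a minimal sketch, assuming each entry is a Markdown table row shaped like `| [Name](url) | Description | Auth | HTTPS | CORS |` under a `###` category heading. The function name `parse_readme_entries`, the regexes, and the output keys (`API`, `Link`, `Cors`, and so on) are my own choices for illustration, not something the repository provides.

```python
import re

# Assumed row shape:
# | [Name](https://example.com) | Description | `apiKey` | Yes | Unknown |
ROW_PATTERN = re.compile(
    r"^\|\s*\[(?P<name>[^\]]+)\]\((?P<link>[^)]+)\)\s*\|"
    r"(?P<description>[^|]*)\|(?P<auth>[^|]*)\|(?P<https>[^|]*)\|(?P<cors>[^|]*)\|"
)
# Assumed category headings: "## Title" or "### Title"
HEADING_PATTERN = re.compile(r"^#{2,3}\s+(?P<title>.+?)\s*$")


def parse_readme_entries(markdown_text):
    """Turn README table rows into dicts, one per API entry."""
    entries = []
    category = "Unknown"
    for line in markdown_text.splitlines():
        line = line.strip()
        heading = HEADING_PATTERN.match(line)
        if heading:
            category = heading.group("title")
            continue
        row = ROW_PATTERN.match(line)
        if row is None:
            continue  # table headers, separator rows, and prose are skipped
        entries.append({
            "API": row.group("name").strip(),
            "Link": row.group("link").strip(),
            "Description": row.group("description").strip(),
            # Auth values are assumed to be wrapped in backticks when present
            "Auth": row.group("auth").strip().strip("`"),
            "HTTPS": row.group("https").strip(),
            "Cors": row.group("cors").strip(),
            "Category": category,
        })
    return entries
```

Calling `parse_readme_entries(open("README.md", encoding="utf-8").read())` on a local clone would give a list of dicts to work with; rows that do not match the assumed shape are silently dropped, so it is worth comparing the count against the README.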
What I Did
I stopped worrying about the stale scripts and focused on extracting value from the curated list.
Step 1: Filter for Agent-Ready APIs
I wrote a quick filter to find APIs that would be easy to integrate into agents (no auth, HTTPS enabled, CORS-friendly):

    """
    Filter public-apis entries suitable for AI agent integration.
    Criteria: No auth required, HTTPS enabled, CORS-friendly.
    """


    def filter_agent_ready_apis(api_entries):
        """
        Filter APIs that are easiest to integrate into AI agents.
        Ideal for rapid prototyping without auth complexity.
        """
        agent_ready = []

        for api in api_entries:
            # Criteria for agent-ready APIs
            no_auth = api.get('Auth', '') in ['', 'None', 'No']
            has_https = api.get('HTTPS', '').lower() == 'yes'
            # 'unknown' is let through here and checked later by hand
            has_cors = api.get('Cors', '').lower() in ['yes', 'unknown']

            if no_auth and has_https and has_cors:
                agent_ready.append({
                    'name': api.get('API'),
                    'link': api.get('Link'),
                    'description': api.get('Description'),
                    'category': api.get('Category', 'Unknown')
                })

        return agent_ready

This immediately cut down the list to APIs I could test without setting up OAuth flows or API keys.
Step 2: Verify APIs Are Still Operational
A curated list doesn’t guarantee that every API is still alive. I wrote a verification script:

    """
    Verify public-apis entries are still operational.
    Run this before relying on any listed API for production use.
    """

    import time

    import requests


    def verify_api_entry(api_info):
        """Verify a single API entry from public-apis repository."""
        result = {
            'name': api_info.get('API', 'Unknown'),
            'link': api_info.get('Link', ''),
            'status': 'unknown',
            'http_status': None,
            'response_time_ms': None,
            'error': None
        }

        if not result['link']:
            result['status'] = 'no_link'
            return result

        try:
            start_time = time.time()
            response = requests.head(
                result['link'],
                timeout=10,
                allow_redirects=True,
                headers={'User-Agent': 'API-Verification-Script/1.0'}
            )
            elapsed = (time.time() - start_time) * 1000

            result['response_time_ms'] = round(elapsed, 2)
            # Any status below 500 counts as reachable, so 4xx responses
            # (including 404) are still reported as 'operational'.
            result['status'] = 'operational' if response.status_code < 500 else 'error'
            result['http_status'] = response.status_code

        except requests.exceptions.Timeout:
            result['status'] = 'timeout'
            result['error'] = 'Connection timed out'
        except requests.exceptions.ConnectionError:
            result['status'] = 'connection_failed'
            result['error'] = 'Could not establish connection'
        except Exception as e:
            result['status'] = 'error'
            result['error'] = str(e)

        return result


    def batch_verify_apis(api_list, delay_seconds=0.5):
        """Verify multiple API entries with rate limiting."""
        results = []
        total = len(api_list)

        for i, api_info in enumerate(api_list, 1):
            print(f"Verifying {i}/{total}: {api_info.get('API', 'Unknown')}")
            result = verify_api_entry(api_info)
            results.append(result)

            # Pause between requests, but not after the last one
            if i < total:
                time.sleep(delay_seconds)

        return results

Step 3: Generate a Report
After running verification, I got a clear picture:
    API Verification Report
    =======================
    Total APIs Tested: 50
    Operational: 42 (84.0%)
    Failed/Unreachable: 6 (12.0%)
    Unknown Status: 2

    Recommendations:
    - Verify failed APIs manually before removing
    - Check auth requirements for operational APIs
    - Test CORS policies for browser-based usage

An 84% operational rate across this 50-API sample is encouraging, given that the repository’s own link-checking scripts are 5+ years stale. The community curation is working.
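The summary part of a report like this takes only a few lines to produce. Below is a minimal sketch, assuming result dicts that carry a `status` key with values such as `operational`, `timeout`, `connection_failed`, `error`, `no_link`, or `unknown`. The name `build_report` and the grouping of statuses into "failed" versus "unknown" are my assumptions, not a fixed rule.

```python
from collections import Counter


def build_report(results):
    """Summarize verification results into a plain-text report."""
    total = len(results)
    counts = Counter(r.get("status", "unknown") for r in results)

    operational = counts["operational"]
    # Assumption: entries without a link are reported as unknown, not failed
    unknown = counts["unknown"] + counts["no_link"]
    failed = total - operational - unknown

    def pct(n):
        # Guard against division by zero on an empty result list
        return f"{n / total * 100:.1f}%" if total else "n/a"

    lines = [
        "API Verification Report",
        "=======================",
        f"Total APIs Tested: {total}",
        f"Operational: {operational} ({pct(operational)})",
        f"Failed/Unreachable: {failed} ({pct(failed)})",
        f"Unknown Status: {unknown}",
    ]
    return "\n".join(lines)
```

Printing `build_report(results)` after a verification run gives the header and counts; the recommendations are static text and can simply be appended.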
Why This Matters
The distinction between “curated list” and “maintenance scripts” is crucial for open source resources:
High Value (Use This):
- Community-maintained API catalog
- Rich metadata for decision-making
- Categorized by use case
- 1,200+ contributors vetting entries
- README is actively maintained
Low Value (Ignore This):
- Automation scripts for repository maintenance
- Link checkers and validators
- 5+ years outdated
- Not essential for API discovery
For API discovery and AI agent development, the list itself is gold. The scripts are irrelevant.
Common Mistakes to Avoid
I almost made these mistakes myself:
- Dismissing the entire repository because the scripts were outdated. This would have wasted a valuable resource.
- Trying to fix the scripts instead of focusing on my actual goal. The scripts don’t need fixing; they’re not the point.
- Blind trust without verification. Even curated lists need validation. Always test APIs before production use.
- Wrong use case expectations. This isn’t a live API directory with uptime guarantees. It’s a discovery starting point.
The Lesson
A high star count doesn’t guarantee active maintenance across every component of a repository. The public-apis repository has 396,000 stars because the list is valuable, not because the scripts are cutting-edge.
When evaluating open source resources, separate the artifact from the tooling:
- Artifact: The curated knowledge (valuable, maintained)
- Tooling: The automation around it (may be stale, non-essential)
For AI agent development, I now have a workflow:
- Use the public-apis README for discovery
- Filter for agent-ready criteria
- Verify operational status
- Test CORS and auth requirements
- Integrate into agents
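For the CORS step, one rough check is to send a request with an `Origin` header and look at `Access-Control-Allow-Origin` in the response. The sketch below uses only the standard library; `cors_allows`, `check_cors`, and the `https://agent.example` origin are hypothetical names, and a real browser applies more rules (preflight requests, credentials) than this covers.

```python
import urllib.request


def cors_allows(response_headers, origin):
    """Return True if the headers permit cross-origin reads from `origin`."""
    allowed = None
    for key, value in response_headers.items():
        # Header names are case-insensitive
        if key.lower() == "access-control-allow-origin":
            allowed = value.strip()
    return allowed in ("*", origin)


def check_cors(url, origin="https://agent.example", timeout=10):
    """Send a GET with an Origin header and inspect the response headers.

    Makes a network call; urlopen raises HTTPError on 4xx/5xx responses,
    so callers may want to catch that.
    """
    request = urllib.request.Request(
        url,
        headers={"Origin": origin, "User-Agent": "API-Verification-Script/1.0"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return cors_allows(dict(response.headers.items()), origin)
```

Keeping the header logic in `cors_allows` separate from the network call makes it easy to test against saved headers without hitting any API.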
The 5-year-old scripts? I never even looked at them again.