Is the public-apis GitHub Repository Still Useful Despite Having Stale Scripts?

I was building an AI agent that needed access to various APIs, and someone pointed me to the public-apis GitHub repository. With 396,000 stars, it’s the most starred repository on GitHub. But when I cloned it, I noticed something concerning—all the automation scripts were over 5 years old.

Was this repository abandoned? Could I trust it for my project?

The Problem

I needed to discover APIs to wire into my agents. The public-apis repository seemed perfect: a massive curated list of free APIs, organized by category, with metadata about authentication, HTTPS support, and CORS policies.

But then I looked at the actual code in the repository:

$ git log --format='%h%d %s - %ar' scripts/
a1b2c3d (HEAD -> master) Update API validator script - 5 years ago
e4f5g6h Add link checker - 5 years ago
i7j8k9l Initial automation scripts - 6 years ago

Five years is an eternity in API years. Endpoints change, services shut down, authentication methods evolve. If the maintenance scripts are that old, how can I trust the list?

I almost dismissed the entire repository. But then I realized I was conflating two different things:

  1. The curated list of APIs (in the README)
  2. The automation scripts (in the /scripts folder)

The Discovery

I started digging deeper. The README showed recent activity from hundreds of contributors. The scripts folder was a ghost town.

Repository Stats:
- 396,000 stars
- 42,000 forks
- 1,200+ contributors
- README: actively maintained
- Scripts: 5+ years stale

I asked myself: what was I actually trying to accomplish?

I wanted to discover APIs for my agents. I didn’t want to run maintenance scripts on the repository itself. The scripts being outdated didn’t affect my use case at all.

The real value was the metadata. Each API entry included:

  • API name and description
  • Authentication requirements
  • HTTPS support
  • CORS policy
  • Direct link to documentation

This is exactly what I needed to evaluate APIs for agent integration.
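
To work with that metadata programmatically, the README tables first have to become structured records. The following is a minimal sketch, assuming each row has the form `| [Name](link) | Description | Auth | HTTPS | CORS |` under a `### Category` heading. The `parse_readme_table` function and the sample rows are made up for illustration, and the dictionary keys are chosen to match what the filter script later in this article reads.

```python
import re

# Matches a table row whose first cell is a Markdown link: | [Name](url) | ... |
ROW_RE = re.compile(
    r"^\|\s*\[(?P<name>[^\]]+)\]\((?P<link>[^)]+)\)\s*\|(?P<rest>.*)\|\s*$"
)


def parse_readme_table(markdown_text):
    """Turn README-style table rows into dicts (row layout is an assumption)."""
    entries = []
    category = "Unknown"
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        if line.startswith("### "):
            # A level-3 heading starts a new category section.
            category = line[4:].strip()
            continue
        match = ROW_RE.match(line)
        if match is None:
            continue  # header rows, separators, and prose are skipped
        cells = [cell.strip().strip("`") for cell in match.group("rest").split("|")]
        if len(cells) != 4:
            continue  # unexpected column count; skip rather than guess
        description, auth, https, cors = cells
        entries.append({
            "API": match.group("name"),
            "Link": match.group("link"),
            "Description": description,
            "Auth": auth,
            "HTTPS": https,
            "Cors": cors,
            "Category": category,
        })
    return entries


# Made-up sample rows, not real entries from the repository.
SAMPLE_README = """\
### Animals
| API | Description | Auth | HTTPS | CORS |
|---|---|---|---|---|
| [Example Cats](https://example.com/cats) | Cat facts | No | Yes | Yes |
| [Example Dogs](https://example.com/dogs) | Dog pictures | `apiKey` | Yes | Unknown |
"""

if __name__ == "__main__":
    for entry in parse_readme_table(SAMPLE_README):
        print(entry["Category"], entry["API"], entry["Auth"], entry["Cors"])
```

Rows that do not fit the assumed layout are skipped instead of being half-parsed, so a formatting change in the README shows up as missing entries, not as wrong metadata.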

What I Did

I stopped worrying about the stale scripts and focused on extracting value from the curated list.

Step 1: Filter for Agent-Ready APIs

I wrote a quick filter to find APIs that would be easy to integrate into agents—no auth, HTTPS enabled, CORS-friendly:

filter_apis.py
"""
Filter public-apis entries suitable for AI agent integration.
Criteria: No auth required, HTTPS enabled, CORS-friendly.
"""


def filter_agent_ready_apis(api_entries):
    """
    Filter APIs that are easiest to integrate into AI agents.
    Ideal for rapid prototyping without auth complexity.
    """
    agent_ready = []
    for api in api_entries:
        # Normalize to lowercase strings so 'Yes', 'yes', and a boolean True
        # all compare the same way; a missing or None value becomes ''.
        auth = str(api.get('Auth') or '').strip().lower()
        https = str(api.get('HTTPS') or '').strip().lower()
        cors = str(api.get('Cors') or '').strip().lower()

        # Criteria for agent-ready APIs
        no_auth = auth in ('', 'none', 'no')
        has_https = https in ('yes', 'true')
        # 'unknown' passes too: treat those entries as candidates to test,
        # not as confirmed CORS support.
        has_cors = cors in ('yes', 'unknown')

        if no_auth and has_https and has_cors:
            agent_ready.append({
                'name': api.get('API'),
                'link': api.get('Link'),
                'description': api.get('Description'),
                'category': api.get('Category', 'Unknown')
            })
    return agent_ready

This immediately cut down the list to APIs I could test without setting up OAuth flows or API keys.
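
Since the list is organized by category, a natural next step is to group the filtered entries so candidates can be reviewed one area at a time. This is a small sketch under assumptions: `group_by_category` is a hypothetical helper, and the sample entries are invented, shaped like the dictionaries the filter returns (`name`, `link`, `description`, `category`).

```python
from collections import defaultdict


def group_by_category(agent_ready):
    """Group filtered entries by category, returning {category: [names]}."""
    grouped = defaultdict(list)
    for entry in agent_ready:
        # Fall back to placeholders when a field is missing or empty.
        category = entry.get("category") or "Unknown"
        grouped[category].append(entry.get("name") or "Unnamed")
    # Sort categories and names so the output is stable between runs.
    return {category: sorted(names) for category, names in sorted(grouped.items())}


# Made-up entries in the shape the filter returns.
sample = [
    {"name": "Example Dogs", "link": "https://example.com/dogs",
     "description": "Dog pictures", "category": "Animals"},
    {"name": "Example Cats", "link": "https://example.com/cats",
     "description": "Cat facts", "category": "Animals"},
    {"name": "Example Forecast", "link": "https://example.com/forecast",
     "description": "Weather data", "category": "Weather"},
]

for category, names in group_by_category(sample).items():
    print(f"{category}: {', '.join(names)}")
```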

Step 2: Verify APIs Are Still Operational

The list being curated doesn’t mean every API is still alive. I wrote a verification script:

verify_apis.py
"""
Verify public-apis entries are still operational.
Run this before relying on any listed API for production use.
"""
import time

import requests


def verify_api_entry(api_info):
    """
    Verify a single API entry from public-apis repository.
    """
    result = {
        'name': api_info.get('API', 'Unknown'),
        'link': api_info.get('Link', ''),
        'status': 'unknown',
        'http_status': None,
        'response_time_ms': None,
        'error': None
    }
    if not result['link']:
        result['status'] = 'no_link'
        return result
    try:
        # perf_counter is a monotonic clock, better suited to timing than time.time()
        start_time = time.perf_counter()
        # HEAD avoids downloading the body; redirects are followed to the final URL
        response = requests.head(
            result['link'],
            timeout=10,
            allow_redirects=True,
            headers={'User-Agent': 'API-Verification-Script/1.0'}
        )
        elapsed = (time.perf_counter() - start_time) * 1000
        result['response_time_ms'] = round(elapsed, 2)
        # Any status below 500 means the server answered. A 4xx (for example 404)
        # still counts as 'operational' here, so inspect http_status as well.
        result['status'] = 'operational' if response.status_code < 500 else 'error'
        result['http_status'] = response.status_code
    except requests.exceptions.Timeout:
        result['status'] = 'timeout'
        result['error'] = 'Connection timed out'
    except requests.exceptions.ConnectionError:
        result['status'] = 'connection_failed'
        result['error'] = 'Could not establish connection'
    except requests.exceptions.RequestException as e:
        # Any other requests failure, such as an invalid URL or too many redirects
        result['status'] = 'error'
        result['error'] = str(e)
    return result


def batch_verify_apis(api_list, delay_seconds=0.5):
    """
    Verify multiple API entries with rate limiting.
    """
    results = []
    total = len(api_list)
    for i, api_info in enumerate(api_list, 1):
        print(f"Verifying {i}/{total}: {api_info.get('API', 'Unknown')}")
        result = verify_api_entry(api_info)
        results.append(result)
        # Pause between requests, but not after the last one
        if i < total:
            time.sleep(delay_seconds)
    return results
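
One caveat with HEAD-based checks: some servers do not implement HEAD and answer 405 (Method Not Allowed) or 501 (Not Implemented) even though the API itself is fine. A possible refinement, sketched here under assumptions, retries with GET in that case. The `probe_status` helper and its `send` callable are hypothetical names, not part of the script above; `send` is injected so the logic can be exercised without network access.

```python
# Status codes that suggest the server rejected the HEAD method itself.
HEAD_UNSUPPORTED = (405, 501)


def probe_status(url, send):
    """
    Return the HTTP status for url, retrying with GET when HEAD is rejected.

    send is a hypothetical callable: send(method, url) -> int status code.
    """
    status = send("HEAD", url)
    if status in HEAD_UNSUPPORTED:
        # The server may be healthy but simply not support HEAD.
        status = send("GET", url)
    return status


# Demo with a fake sender that rejects HEAD but accepts GET (no network needed).
def fake_send(method, url):
    return 405 if method == "HEAD" else 200


print(probe_status("https://example.com/api", fake_send))
```

With `requests`, `send` could be something like `lambda method, url: requests.request(method, url, timeout=10, allow_redirects=True).status_code`. The GET fallback downloads the response body, so it is worth keeping it as a second attempt only.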

Step 3: Generate a Report

After running verification, I got a clear picture:

API Verification Report
=======================
Total APIs Tested: 50
Operational: 42 (84.0%)
Failed/Unreachable: 6 (12.0%)
Unknown Status: 2
Recommendations:
- Verify failed APIs manually before removing
- Check auth requirements for operational APIs
- Test CORS policies for browser-based usage

An 84% operational rate across the 50 APIs I tested is actually impressive, given how stale the scripts are. The community curation is working.
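
A summary like this can be produced from the verification results with a few lines. The following is a minimal sketch, not the exact script used for the report above, assuming result dictionaries that carry the status values set by the verification script (`operational`, `timeout`, `connection_failed`, `error`, `no_link`, `unknown`). The `summarize_results` name and the grouping into failed and unknown buckets are illustrative choices, and the demo data is synthetic.

```python
from collections import Counter

# Statuses from the verification step that count as failures (assumed grouping).
FAILED_STATUSES = {"timeout", "connection_failed", "error"}


def summarize_results(results):
    """Build a plain-text summary from a list of verification result dicts."""
    total = len(results)
    counts = Counter(r.get("status", "unknown") for r in results)
    operational = counts["operational"]
    failed = sum(counts[status] for status in FAILED_STATUSES)
    # Everything else, such as 'no_link' or 'unknown', lands in this bucket.
    unknown = total - operational - failed

    def pct(n):
        # Guard against division by zero for an empty result list.
        return f"{100 * n / total:.1f}%" if total else "0.0%"

    return "\n".join([
        "API Verification Report",
        "=======================",
        f"Total APIs Tested: {total}",
        f"Operational: {operational} ({pct(operational)})",
        f"Failed/Unreachable: {failed} ({pct(failed)})",
        f"Unknown Status: {unknown}",
    ])


# Synthetic results for demonstration only.
demo_results = (
    [{"status": "operational"}] * 42
    + [{"status": "timeout"}] * 4
    + [{"status": "connection_failed"}] * 2
    + [{"status": "no_link"}] * 2
)
print(summarize_results(demo_results))
```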

Why This Matters

The distinction between “curated list” and “maintenance scripts” is crucial for open source resources:

High Value (Use This):

  • Community-maintained API catalog
  • Rich metadata for decision-making
  • Categorized by use case
  • 1,200+ contributors vetting entries
  • README is actively maintained

Low Value (Ignore This):

  • Automation scripts for repository maintenance
  • Link checkers and validators
  • 5+ years outdated
  • Not essential for API discovery

For API discovery and AI agent development, the list itself is gold. The scripts are irrelevant.

Common Mistakes to Avoid

I almost made these mistakes myself:

  1. Dismissing the entire repository because the scripts were outdated. This would have wasted a valuable resource.

  2. Trying to fix the scripts instead of focusing on my actual goal. The scripts don’t need fixing—they’re not the point.

  3. Blind trust without verification. Even curated lists need validation. Always test APIs before production use.

  4. Wrong use case expectations. This isn’t a live API directory with uptime guarantees. It’s a discovery starting point.

The Lesson

A high star count doesn’t guarantee active maintenance across every component of a repository. The public-apis repository has 396,000 stars because the list is valuable, not because the scripts are cutting-edge.

When evaluating open source resources, separate the artifact from the tooling:

  • Artifact: The curated knowledge (valuable, maintained)
  • Tooling: The automation around it (may be stale, non-essential)

For AI agent development, I now have a workflow:

  1. Use the public-apis README for discovery
  2. Filter for agent-ready criteria
  3. Verify operational status
  4. Test CORS and auth requirements
  5. Integrate into agents

The 5-year-old scripts? I never even looked at them again.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
