How to resolve LangExtract Ollama timeout and connection errors
Problem
When I tried to use LangExtract with Ollama for local LLM inference, I got these errors:
Error 1 - Connection refused:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8a4c3d9e80>: Failed to establish a new connection: [Errno 111] Connection refused'))
Error 2 - Timeout with large models:
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=120)
Error 3 - Invalid JSON output:
json.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
These errors occurred with different models and configurations, making it hard to debug.
Environment
- Python 3.11
- LangExtract 0.5.0
- Ollama 0.1.24
- macOS 14.5
What happened?
I wanted to use LangExtract with local Ollama models to extract structured data from text. I started with a basic setup:
import langextract as lx

result = lx.extract(
    text="Apple Inc. was founded by Steve Jobs in 1976.",
    prompt_description="Extract company entities",
    examples=[
        {
            "text": "Microsoft was founded by Bill Gates.",
            "entities": [{"company": "Microsoft", "founder": "Bill Gates"}],
        }
    ],
    model_id="gemma2:2b",
    model_url="http://localhost:11434",
)
print(result)
I ran this with Ollama running in one terminal:
# Terminal 1
ollama serve
And my Python script in another:
# Terminal 2
python extract.py
But I got the connection refused error. I realized Ollama wasn’t actually running. I fixed that by checking the service status first.
Then I tried with a larger model and got timeout errors:
result = lx.extract(
    text="Long text with many paragraphs...",
    prompt_description="Extract all entities",
    examples=[...],
    model_id="llama3.1:70b",  # 70 billion parameters
    model_url="http://localhost:11434",
    # timeout is 120 seconds by default
)
The script timed out after 120 seconds. Large models need more time.
Then I tried with gpt-oss:20b and got JSON parse errors. The model output wasn’t valid JSON.
How to solve it?
I tackled each error separately.
Solution 1: Test Ollama connection first
Before running LangExtract, I verified Ollama is running:
import requests

def test_ollama_connection(url="http://localhost:11434"):
    try:
        response = requests.get(f"{url}/api/tags", timeout=5)
        if response.status_code == 200:
            print("✓ Ollama is running")
            models = response.json().get("models", [])
            print(f"Available models: {[m['name'] for m in models]}")
            return True
    except requests.exceptions.ConnectionError:
        print("✗ Ollama is not running. Start it with: ollama serve")
    except Exception as e:
        print(f"✗ Error: {e}")
    return False

test_ollama_connection()
I run this before extraction to catch connection issues early.
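Right after starting Ollama (especially in a container), the server may need a few seconds before it answers. A minimal sketch of a polling wrapper around the same /api/tags check, using only the standard library; the names ollama_responds and wait_until_ready, and the default of 10 attempts with a 2 second delay, are my own assumptions and not part of LangExtract or Ollama:

```python
import time
import urllib.error
import urllib.request


def ollama_responds(url="http://localhost:11434"):
    """Return True if GET {url}/api/tags answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{url}/api/tags", timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe=ollama_responds, attempts=10, delay=2.0):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Calling wait_until_ready() before lx.extract(...) gives the server some time to come up instead of failing on the first refused connection. The probe is passed in as a parameter so the retry logic can be exercised without a running server.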
Solution 2: Increase timeout for large models
I added the timeout parameter based on model size:
import langextract as lx

# Timeout guidelines based on my testing:
# 2B-8B models: 60-120 seconds
# 13B-30B models: 120-180 seconds
# 70B models: 300-600 seconds

result = lx.extract(
    text="your text here...",
    prompt_description="Extract entities",
    examples=[...],
    model_id="llama3.1:70b",
    model_url="http://localhost:11434",
    timeout=300,  # 5 minutes for 70B model
)
I created a simple helper function:
def get_timeout_for_model(model_id: str) -> int:
    """Return recommended timeout in seconds based on model size."""
    if "70b" in model_id.lower():
        return 300
    if "30b" in model_id.lower() or "13b" in model_id.lower():
        return 180
    return 120  # default for smaller models
Solution 3: Use fenced output for non-schema models
Some models like gpt-oss:20b don’t follow JSON schema well. I use fence_output=True to wrap output in markdown code blocks:
result = lx.extract(
    text="your text here...",
    prompt_description="Extract entities",
    examples=[...],
    model_id="gpt-oss:20b",
    model_url="http://localhost:11434",
    timeout=180,
    fence_output=True,  # Force markdown code fence
    use_schema_constraints=False,  # Don't enforce JSON schema
)
This forces the model to output in markdown code blocks, which LangExtract can parse more reliably.
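To illustrate why fenced output is easier to handle, here is a minimal sketch of pulling JSON out of a markdown code fence while ignoring any chatter around it. This is not LangExtract's actual parser; extract_fenced_json and FENCED_BLOCK are hypothetical names I made up for the illustration:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Match an optional "json" language tag, then capture everything up to the closing fence.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def extract_fenced_json(raw: str):
    """Parse the JSON inside the first code fence; fall back to the whole string."""
    match = FENCED_BLOCK.search(raw)
    payload = match.group(1) if match else raw
    return json.loads(payload)


# Hypothetical model reply with extra text before and after the fence.
reply = (
    "Sure, here you go:\n"
    + FENCE + "json\n"
    + '{"company": "Apple Inc."}\n'
    + FENCE + "\n"
    + "Hope that helps!"
)
parsed = extract_fenced_json(reply)
```

The fence gives the parser clear start and end markers, so a friendly sentence before or after the JSON no longer breaks json.loads.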
Docker setup for consistent environment
I use Docker Compose to run Ollama consistently:
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0

volumes:
  ollama_data:
Then start Ollama:
docker-compose up -d
docker exec -it <container-id> ollama pull llama3.1:70b
The reason
As far as I can tell, each error has its own cause:
Connection refused: The Ollama service isn’t running, so nothing is listening on the default port 11434. Either the ollama serve command or the Docker container has to be active.
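"Connection refused" is a TCP-level failure, so it can be checked without any HTTP library. A small sketch, assuming the default host and port from this post; is_port_open is a hypothetical helper name:

```python
import socket


def is_port_open(host="localhost", port=11434, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False
```

If is_port_open() returns False, the problem is with the Ollama process or the Docker port mapping, not with LangExtract or the model.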
Timeout: Default timeout is 120 seconds. Large models like llama3.1:70b need 5+ minutes on consumer hardware. The request times out before the model finishes inference.
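A rough way to pick a timeout is to estimate how long generation takes at the speed you measure on your own machine. The sketch below is only back-of-the-envelope arithmetic; estimate_timeout, the safety factor of 2, and the example numbers are assumptions for illustration, not measurements:

```python
import math


def estimate_timeout(output_tokens: int, tokens_per_second: float,
                     load_seconds: float = 0.0, safety_factor: float = 2.0) -> int:
    """Estimate a read timeout in seconds from a measured generation speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    generation_seconds = output_tokens / tokens_per_second
    # Add model load time, then pad the total so slow runs don't time out.
    return math.ceil((load_seconds + generation_seconds) * safety_factor)


# Hypothetical numbers: 500 output tokens at 2 tokens/s plus 30 s to load the model.
timeout = estimate_timeout(500, 2.0, load_seconds=30)
```

With these made-up numbers, generation alone takes 250 seconds, which is already more than twice the 120 second default.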
Invalid JSON: Some models (like gpt-oss:20b) aren’t fine-tuned for structured output. They may add extra text, miss quotes, or produce malformed JSON. Using fence_output=True helps by asking for markdown-wrapped output instead.
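The exact message from Error 3 is what Python's json module reports when the text does not start with a JSON value, for example when the reply is empty or begins with a sentence. A small sketch to reproduce it; describe_json_failure is a hypothetical helper and the sample replies are made up:

```python
import json


def describe_json_failure(raw: str) -> str:
    """Return 'ok' if raw parses as JSON, else the decoder's error message."""
    try:
        json.loads(raw)
        return "ok"
    except json.JSONDecodeError as exc:
        return str(exc)


empty_reply = describe_json_failure("")
chatty_reply = describe_json_failure('Sure! Here is the JSON: {"company": "Apple"}')
clean_reply = describe_json_failure('{"company": "Apple"}')
```

Both the empty and the chatty reply fail at character 0, so the error message alone does not tell you which case you hit. Logging the raw model output is the quickest way to find out.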
Here’s a quick troubleshooting flow:
  ┌─────────────────────┐
  │ LangExtract errors? │
  └──────────┬──────────┘
             │
             ▼
      ┌──────────────┐
      │ Can you curl │
      │ localhost:   │
      │ 11434?       │
      └──────┬───────┘
             │
    No ──────┴────── Yes
     │                │
     ▼                ▼
┌─────────┐    ┌────────────┐
│ Start   │    │Which error?│
│ Ollama  │    └──────┬─────┘
└─────────┘           │
           ┌──────────┼──────────┐
           ▼          ▼          ▼
       ┌───────┐  ┌───────┐  ┌───────┐
       │Timeout│  │Invalid│  │ Other │
       └───┬───┘  └───┬───┘  └───┬───┘
           │          │          │
           ▼          ▼          ▼
       Increase fence_output   Check
        timeout     =True      logs
Summary
In this post, I solved common LangExtract Ollama provider errors. The key points are:
- Test Ollama connection before running extraction
- Increase timeout for large models (70B needs 5+ minutes)
- Use fence_output=True for models that don’t follow JSON schema well
- Consider Docker for a consistent Ollama environment
These fixes resolved the errors I encountered with gemma2:2b, llama3.1:70b, and gpt-oss:20b models.
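The troubleshooting flow can also be sketched as a small function that maps an exception to the fix from this post. It matches on class names so that it works for both the built-in exceptions and the ones from requests without importing requests; suggest_fix and the message constants are hypothetical names, and name-based matching is a shortcut I chose for the sketch:

```python
import json

START_OLLAMA = "Start Ollama (ollama serve or the Docker container)"
INCREASE_TIMEOUT = "Increase the timeout for this model"
USE_FENCED_OUTPUT = "Try fence_output=True"
CHECK_LOGS = "Check the Ollama logs"


def suggest_fix(exc: BaseException) -> str:
    """Map an exception to the matching fix from the troubleshooting flow."""
    # Collect the class names of the exception and all of its base classes.
    names = {cls.__name__ for cls in type(exc).__mro__}
    if isinstance(exc, json.JSONDecodeError):
        return USE_FENCED_OUTPUT
    if "ConnectionError" in names:
        return START_OLLAMA
    if names & {"Timeout", "ReadTimeout", "TimeoutError"}:
        return INCREASE_TIMEOUT
    return CHECK_LOGS
```

In practice I would wrap the lx.extract(...) call in a try/except and print suggest_fix(e) next to the original traceback.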