How to resolve LangExtract OpenAI setup errors when switching from Gemini
Problem
When I tried using LangExtract with OpenAI’s GPT models after setting it up with Gemini, I got these errors:
Traceback (most recent call last):
  File "/path/to/test.py", line 5, in <module>
    import langextract as lx
  File "/path/to/langextract/__init__.py", line 10, in <module>
    from .extract import extract
  File "/path/to/langextract/extract.py", line 15, in <module>
    from openai import OpenAI
ModuleNotFoundError: No module named 'openai'

Then, after installing the missing module, I got authentication errors:

openai.AuthenticationError: Error code: 401 - Incorrect API key provided

And after fixing the API key, the output parsing failed:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Environment
- Python 3.11
- LangExtract 0.1.0
- OpenAI GPT-4o
- macOS 14.5
What happened?
I was using LangExtract with Gemini and it worked fine. Here’s my working Gemini setup:
import langextract as lx
import os

# Set API key
os.environ['LANGEXTRACT_API_KEY'] = 'my-gemini-key'

result = lx.extract(
    text="Patient prescribed Amoxicillin 500mg twice daily.",
    prompt_description="Extract medications, dosages, and frequency",
    examples=[
        lx.data.ExampleData(
            text="Take Aspirin 100mg orally once daily",
            extractions=[
                lx.data.Extraction(
                    extraction_class="medication",
                    extraction_text="Aspirin",
                    attributes={
                        "dosage": "100mg",
                        "route": "orally",
                        "frequency": "once daily"
                    }
                )
            ]
        )
    ],
    model_id="gemini-1.5-flash"
)

The key parts:

- Using the LANGEXTRACT_API_KEY environment variable
- Default parameters work for Gemini
- No special installation needed
So I tried switching to OpenAI by changing the model_id:
os.environ['LANGEXTRACT_API_KEY'] = 'sk-proj-...' # OpenAI key
result = lx.extract(
    text="Patient prescribed Amoxicillin 500mg twice daily.",
    prompt_description="Extract medications, dosages, and frequency",
    examples=[...],  # Same examples
    model_id="gpt-4o"  # Changed to OpenAI
)

But when I ran this, I got the ModuleNotFoundError: No module named 'openai'.
How to solve it?
Solution #1: Install OpenAI Extra
I checked the LangExtract documentation and found OpenAI support is an optional dependency. I tried installing it:
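pip install "langextract[openai]"

(The quotes keep zsh, the default shell on macOS, from treating the square brackets as a glob pattern.)

Or for development installs:

pip install -e ".[openai]"

This fixed the first error.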
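To confirm that the optional dependency is actually importable before calling LangExtract, a quick check like the one below may help. It is a minimal sketch using only the standard library; the helper name `has_module` is my own and not part of LangExtract.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if not has_module("openai"):
    # Hint only; nothing is installed automatically.
    print('openai is missing; try: pip install "langextract[openai]"')
```

Running this in the same virtual environment as the script shows whether the extra landed where the script will look for it.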
Solution #2: Fix API Key Variable
After installing the OpenAI dependency, I ran the code again and got AuthenticationError: 401.
I checked the LangExtract source code and found OpenAI uses a different environment variable. I changed from LANGEXTRACT_API_KEY to OPENAI_API_KEY:
import langextract as lx
import os

# WRONG for OpenAI
# os.environ['LANGEXTRACT_API_KEY'] = 'sk-proj-...'

# CORRECT for OpenAI
os.environ['OPENAI_API_KEY'] = 'sk-proj-...'

Solution #3: Disable Schema Constraints
After fixing the API key, the code ran but returned empty extractions or JSON parsing errors. I tried checking the actual LLM output and found OpenAI was returning unfenced JSON.
LangExtract has two important parameters for OpenAI:
result = lx.extract(
    text="Patient prescribed Amoxicillin 500mg twice daily.",
    prompt_description="Extract medications, dosages, and frequency",
    examples=[...],
    model_id="gpt-4o",
    api_key=os.environ['OPENAI_API_KEY'],
    use_schema_constraints=False,  # Not available for OpenAI models in LangExtract
    fence_output=True  # Wrap JSON in markdown code blocks
)

The key changes:

- use_schema_constraints=False: LangExtract does not apply schema constraints to OpenAI models the way it does for Gemini
- fence_output=True: Tells the LLM to wrap JSON in markdown code blocks for reliable parsing
Now test again:
print(f"Extracted: {result.extractions}")
# Output: Extracted: [Extraction(extraction_class='medication', extraction_text='Amoxicillin', attributes={'dosage': '500mg', 'frequency': 'twice daily'})]

As you can see, the medication data was extracted successfully with OpenAI.
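Since these two flags only matter when the model is an OpenAI one, it can be convenient to derive them from the model ID. The helper below is a hypothetical sketch: the name `extract_kwargs_for` and the `gpt-` prefix check are my assumptions, not LangExtract API. It only builds a dictionary and does not call the library.

```python
def extract_kwargs_for(model_id: str) -> dict:
    """Return extra keyword arguments for lx.extract based on the model ID.

    Assumption: OpenAI chat models are identified by a "gpt-" prefix.
    """
    if model_id.startswith("gpt-"):
        # Same values as the OpenAI configuration used in this post.
        return {"fence_output": True, "use_schema_constraints": False}
    # Gemini worked with the library defaults, so nothing is overridden.
    return {}


print(extract_kwargs_for("gpt-4o"))
print(extract_kwargs_for("gemini-1.5-flash"))
```

The result could then be unpacked into the call, for example `lx.extract(..., model_id=model_id, **extract_kwargs_for(model_id))`, so switching providers becomes a one-line change.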
The reason
I think the key reason for these errors is that LangExtract defaults are optimized for Gemini, not OpenAI:
- Missing dependency: OpenAI support is opt-in via the [openai] extra and is not installed by default
- API key mismatch: the OpenAI SDK looks for OPENAI_API_KEY, not LANGEXTRACT_API_KEY
- Schema constraints: LangExtract enforces structured output for Gemini through schema constraints, but it does not apply them to OpenAI models, so they have to be turned off
- Output fencing: GPT models often output raw JSON without markdown code blocks, which breaks LangExtract’s parser
Here’s the API key priority matrix:
| Provider | Primary Env Var | Fallback Env Var | Notes |
|---|---|---|---|
| Gemini | GEMINI_API_KEY | LANGEXTRACT_API_KEY | Default |
| OpenAI | OPENAI_API_KEY | LANGEXTRACT_API_KEY | Requires [openai] install |
| Ollama | (none needed) | (none needed) | Local only |
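The lookup order in the table can be expressed as a small function. This is only a sketch of the priority described here, not LangExtract's actual resolution code; `resolve_api_key` and the `KEY_PRIORITY` mapping are hypothetical names.

```python
import os

# Lookup order per provider, copied from the table above.
KEY_PRIORITY = {
    "gemini": ["GEMINI_API_KEY", "LANGEXTRACT_API_KEY"],
    "openai": ["OPENAI_API_KEY", "LANGEXTRACT_API_KEY"],
    "ollama": [],  # local only, no key needed
}


def resolve_api_key(provider, env=None):
    """Return the first non-empty key for the provider, or None."""
    env = os.environ if env is None else env
    for name in KEY_PRIORITY[provider]:
        value = env.get(name)
        if value:
            return value
    return None
```

Passing a plain dictionary as `env` makes the function easy to try out without touching real environment variables, for example `resolve_api_key("openai", {"OPENAI_API_KEY": "sk-proj-..."})`.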
The fence_output parameter is critical for OpenAI. Without it:
GPT output: {"entity": "value"}  # Raw JSON

With fence_output=True:

GPT output:
```json
{"entity": "value"}
```

LangExtract's parser knows to extract JSON from the code block.
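To illustrate why the fence matters, here is a minimal sketch of a parser that accepts both forms. It is not LangExtract's parser; `parse_model_json` is a hypothetical helper built on the standard library only.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to keep this listing readable

# Matches a fenced block (with or without a "json" tag) and captures its body.
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_model_json(raw: str):
    """Parse JSON from model output, with or without a markdown fence."""
    match = FENCE_RE.search(raw)
    payload = match.group(1) if match else raw.strip()
    return json.loads(payload)


fenced = FENCE + 'json\n{"entity": "value"}\n' + FENCE
print(parse_model_json(fenced))
print(parse_model_json('{"entity": "value"}'))
```

Passing the fenced string straight to `json.loads` instead raises `JSONDecodeError: Expecting value: line 1 column 1 (char 0)`, the same message as in the Problem section, because the first character is a backtick rather than the start of a JSON value. A parser has to know which form to expect, and that is what the fence_output setting communicates.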
## Complete Working Example
Here's a complete example with error handling and proper configuration:
```python title="openai_complete_example.py"
import langextract as lx
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Verify API key is set
api_key = os.environ.get('OPENAI_API_KEY')
if not api_key:
    raise ValueError(
        "OPENAI_API_KEY not set. "
        "OpenAI requires specific env variable, not LANGEXTRACT_API_KEY."
    )

# Configure OpenAI-specific parameters
result = lx.extract(
    text="Patient prescribed Amoxicillin 500mg twice daily.",
    prompt_description="Extract medications, dosages, and frequency",
    examples=[
        lx.data.ExampleData(
            text="Take Aspirin 100mg orally once daily",
            extractions=[
                lx.data.Extraction(
                    extraction_class="medication",
                    extraction_text="Aspirin",
                    attributes={
                        "dosage": "100mg",
                        "route": "orally",
                        "frequency": "once daily"
                    }
                )
            ]
        )
    ],
    model_id="gpt-4o",
    api_key=api_key,
    fence_output=True,  # Required for OpenAI
    use_schema_constraints=False  # Not supported for OpenAI
)

print(f"Extracted: {result.extractions}")
```

Summary
In this post, I showed how to resolve LangExtract OpenAI setup errors when switching from Gemini. The key point is that OpenAI requires the [openai] extra, the OPENAI_API_KEY environment variable, and specific parameters (fence_output=True, use_schema_constraints=False) to work properly with LangExtract.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 LangExtract Official Documentation
- 👨💻 OpenAI API Documentation
- 👨💻 LangExtract GitHub Issues - OpenAI Provider
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!