How to Convert Jupyter Notebooks to Production-Ready Python Code
The Problem
I built a machine learning model in Jupyter Notebook. It worked perfectly. Then my manager said: “Great, deploy it to production by Friday.”
I stared at my notebook. Twenty-three cells. Global variables everywhere. Hardcoded file paths. No error handling. How do I turn this into something that runs reliably in production?
    # My "working" prototype
    import pandas as pd

    df = pd.read_csv('/Users/me/Desktop/project/data.csv')  # Hardcoded path
    df = df.dropna()  # Silent data loss
    X = df[['feature1', 'feature2']]
    y = df['target']

This is the gap between data science and engineering. Notebooks excel at exploration, but production demands reproducibility, testing, and scalability. Here’s how I bridged that gap.
First Attempt: Manual Copy-Paste
I tried copying code cell by cell into a Python file.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.read_csv('/Users/me/Desktop/project/data.csv')
    df = df.dropna()
    X = df[['feature1', 'feature2']]
    y = df['target']
    model = RandomForestClassifier()
    model.fit(X, y)

Problems showed up immediately:
- The hardcoded path failed on the server
- No logging to track what happened
- No error handling when data was missing
- No way to run with different parameters
I needed a better approach.
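As a rough illustration of what fixing those four problems can look like, here is a minimal sketch using only the standard library. The flag names (`--data-path`, `--n-estimators`), the logger name, and the helper functions are assumptions made for this example, not part of my original script.

```python
import argparse
import logging
from pathlib import Path

logger = logging.getLogger("train")  # hypothetical logger name


def parse_args(argv=None):
    """Read the data path and model settings from the command line."""
    parser = argparse.ArgumentParser(description="Train the model")
    parser.add_argument("--data-path", type=Path, required=True)
    parser.add_argument("--n-estimators", type=int, default=100)
    return parser.parse_args(argv)


def check_data_file(path: Path) -> Path:
    """Fail loudly, with a log entry, if the input file is missing."""
    if not path.is_file():
        logger.error("Data file not found: %s", path)
        raise FileNotFoundError(path)
    logger.info("Using data file: %s", path)
    return path


# Example: arguments passed explicitly instead of being read from sys.argv
args = parse_args(["--data-path", "data/production.csv", "--n-estimators", "200"])
```

The path and the model setting now come from the caller, and a missing file produces a logged, explicit error instead of a silent failure.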
Step 1: Convert with nbconvert
nbconvert is Jupyter’s built-in conversion tool. It extracts code from notebooks into Python scripts.
    # Convert a single notebook
    jupyter nbconvert --to script prototype.ipynb

    # Clear outputs before conversion (cleaner result)
    jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace prototype.ipynb
    jupyter nbconvert --to script prototype.ipynb

This gave me a .py file with all my code. But it was still messy - comments mixed with code, no structure.
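To see why the raw output needs cleanup, it helps to remember that an .ipynb file is plain JSON with a list of cells. The sketch below is not how nbconvert works internally; it is a hypothetical helper (`extract_code_cells`) that keeps only code cells and drops lines that look like IPython magics or shell escapes, using a deliberately crude prefix check.

```python
import json


def extract_code_cells(notebook_json: str) -> str:
    """Return the code cells of a notebook as one script, skipping magics."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # drop markdown and raw cells
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of lines
        text = "".join(source) if isinstance(source, list) else source
        lines = [
            line for line in text.splitlines()
            if not line.lstrip().startswith(("%", "!"))  # crude magic/shell filter
        ]
        if any(line.strip() for line in lines):
            chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# Tiny in-memory notebook used only for illustration
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "import math\n"]},
        {"cell_type": "code", "source": "x = math.sqrt(16)"},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})
script = extract_code_cells(example)
```

Here `script` contains only the `import math` line and the assignment, separated by a blank line. A filter like this is only a first pass; the real restructuring still happens by hand.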
For automation, I switched to executing the notebook programmatically with nbclient:
    import nbformat
    from nbclient import NotebookClient
    from nbclient.exceptions import CellExecutionError

    # Load the notebook
    with open('prototype.ipynb', 'r') as f:
        nb = nbformat.read(f, as_version=4)

    # Execute notebook programmatically
    client = NotebookClient(
        nb,
        timeout=600,
        kernel_name='python3',
        resources={'metadata': {'path': 'notebooks/'}}
    )

    try:
        client.execute()
    except CellExecutionError as e:
        print(f'Error executing notebook: {e}')
        raise
    finally:
        nbformat.write(nb, 'executed_notebook.ipynb')

This approach lets me run notebooks in pipelines and catch errors programmatically.
Step 2: Parameterize with Papermill
My notebook had hardcoded parameters scattered everywhere. I needed to pass different values for different runs.
Papermill solves this. First, I added a “parameters” cell to my notebook:
    # Tag this cell as "parameters" in Jupyter
    alpha = 0.5
    l1_ratio = 0.1
    n_estimators = 100
    data_path = "data/default.csv"

Then I could execute with different parameters:

    # Execute with custom parameters
    papermill input.ipynb output.ipynb \
        -p alpha 0.6 \
        -p l1_ratio 0.1 \
        -p data_path "data/production.csv"

Or programmatically:

    import papermill as pm

    pm.execute_notebook(
        'templates/model_training.ipynb',
        'outputs/training_run_001.ipynb',
        parameters=dict(
            alpha=0.6,
            l1_ratio=0.1,
            data_path='data/production.csv'
        )
    )

Now I can run the same notebook with different configurations for dev, staging, and production.
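One way to organize those per-environment runs is to keep the parameters in a small table and build the papermill arguments from it. This is only a sketch: the environment names, the staging path, the override values, and the `build_run` helper are illustrative assumptions, and the papermill call itself is left as a comment so the snippet runs without papermill installed.

```python
# Hypothetical per-environment settings; names and values are illustrative
BASE_PARAMS = {"alpha": 0.5, "l1_ratio": 0.1, "n_estimators": 100}
ENV_OVERRIDES = {
    "dev": {"data_path": "data/default.csv"},
    "staging": {"data_path": "data/staging.csv", "alpha": 0.6},
    "production": {"data_path": "data/production.csv", "alpha": 0.6},
}


def build_run(env: str, template: str = "templates/model_training.ipynb") -> dict:
    """Return keyword arguments for one papermill run in the given environment."""
    if env not in ENV_OVERRIDES:
        raise ValueError(f"Unknown environment: {env!r}")
    return {
        "input_path": template,
        "output_path": f"outputs/training_{env}.ipynb",
        # Environment-specific values win over the shared defaults
        "parameters": {**BASE_PARAMS, **ENV_OVERRIDES[env]},
    }


run = build_run("production")
# With papermill installed, the run would be started with:
#   import papermill as pm
#   pm.execute_notebook(**run)
```

Keeping the defaults in one place means a new environment only has to list what differs.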
Step 3: Refactor for Production Quality
The converted script still had notebook-style code. I needed proper structure.
Before (Notebook Style)
    # Messy notebook-style code
    import pandas as pd
    df = pd.read_csv('data.csv')
    df = df.dropna()
    X = df[['feature1', 'feature2']]
    y = df['target']
    from sklearn.ensemble import RandomForestClassifier
    model = RandomForestClassifier()
    model.fit(X, y)

After (Production-Ready)
    import pandas as pd
    from typing import Tuple
    import logging

    logger = logging.getLogger(__name__)

    def load_and_preprocess_data(
        filepath: str,
        features: list[str],
        target: str
    ) -> Tuple[pd.DataFrame, pd.Series]:
        """
        Load and preprocess data for model training.

        Args:
            filepath: Path to the CSV data file
            features: List of feature column names
            target: Target column name

        Returns:
            Tuple of (features DataFrame, target Series)

        Raises:
            FileNotFoundError: If data file doesn't exist
            ValueError: If required columns are missing
        """
        try:
            df = pd.read_csv(filepath)
        except FileNotFoundError:
            logger.error(f"Data file not found: {filepath}")
            raise

        required_columns = features + [target]
        missing = set(required_columns) - set(df.columns)
        if missing:
            raise ValueError(f"Missing columns: {missing}")

        df = df.dropna(subset=required_columns)
        X = df[features]
        y = df[target]

        logger.info(f"Loaded {len(df)} samples with {len(features)} features")
        return X, y


    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    import numpy as np
    import logging

    logger = logging.getLogger(__name__)

    class ModelTrainer:
        def __init__(self, n_estimators: int = 100, random_state: int = 42):
            self.n_estimators = n_estimators
            self.random_state = random_state
            self.model = None

        def train(self, X, y) -> dict:
            """Train model and return metrics."""
            self.model = RandomForestClassifier(
                n_estimators=self.n_estimators,
                random_state=self.random_state
            )

            cv_scores = cross_val_score(self.model, X, y, cv=5)
            self.model.fit(X, y)

            metrics = {
                'cv_mean': np.mean(cv_scores),
                'cv_std': np.std(cv_scores)
            }

            logger.info(f"Model trained. CV accuracy: {metrics['cv_mean']:.3f} (+/- {metrics['cv_std']:.3f})")
            return metrics

        def predict(self, X):
            """Make predictions."""
            if self.model is None:
                raise RuntimeError("Model not trained. Call train() first.")
            return self.model.predict(X)

Key changes I made:
- Type hints for IDE support and error catching
- Logging instead of print statements
- Error handling with specific exceptions
- Docstrings for documentation
- Classes to encapsulate state
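One detail about the logging change: a module-level `logging.getLogger(__name__)` does not show INFO messages until something configures a handler and level, which is usually done once at the program's entry point. A minimal sketch, assuming a hypothetical `configure_logging` helper and a made-up module name:

```python
import logging


def configure_logging(level: str = "INFO") -> None:
    """Configure the root logger once, at the program's entry point."""
    logging.basicConfig(
        level=getattr(logging, level.upper()),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier (Python 3.8+)
    )


configure_logging("INFO")
logger = logging.getLogger("pipeline.data")  # hypothetical module name
logger.info("Loaded %d samples with %d features", 1000, 2)
```

The modules themselves only ask for a logger; the entry point decides the format, level, and destination.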
Step 4: Deploy with MLflow
For model versioning and deployment, I used MLflow. It tracks experiments, versions models, and serves predictions.
    import mlflow
    import mlflow.sklearn
    from mlflow.tracking import MlflowClient
    import logging

    # ModelTrainer is the class defined in Step 3

    logger = logging.getLogger(__name__)

    def train_and_log_model(X, y, model_name: str = "production_model"):
        """Train model and log to MLflow."""

        with mlflow.start_run():
            # Log parameters
            mlflow.log_param("n_estimators", 100)
            mlflow.log_param("random_state", 42)

            # Train model
            trainer = ModelTrainer(n_estimators=100)
            metrics = trainer.train(X, y)

            # Log metrics
            mlflow.log_metric("cv_accuracy", metrics['cv_mean'])
            mlflow.log_metric("cv_std", metrics['cv_std'])

            # Log and register model
            mlflow.sklearn.log_model(
                trainer.model,
                "model",
                registered_model_name=model_name
            )

            run_id = mlflow.active_run().info.run_id
            logger.info(f"Model logged with run_id: {run_id}")

        return run_id

Promoting to production:
    def promote_to_production(model_name: str, version: int):
        """Transition model version to Production stage."""

        client = MlflowClient()

        client.transition_model_version_stage(
            name=model_name,
            version=version,
            stage="Production"
        )

        logger.info(f"Model {model_name} v{version} promoted to Production")

Serving the model:
    # Local serving
    mlflow models serve \
        -m "models:/production_model/Production" \
        --host 0.0.0.0 --port 5000

Making predictions via REST API:
    import requests

    url = "http://127.0.0.1:5000/invocations"
    data = {
        "dataframe_split": {
            "columns": ["feature1", "feature2"],
            "data": [[5.1, 3.5]]
        }
    }

    response = requests.post(url, json=data)
    predictions = response.json()
    print(f"Predictions: {predictions}")

The Complete Workflow
I now follow this pipeline:
    Jupyter Notebook (prototype)
              |
              v
    nbconvert --to script
              |
              v
    Refactor into modules
              |
              v
    Add papermill parameters
              |
              v
    Unit tests + CI/CD
              |
              v
    MLflow Model Registry
              |
              v
    Production Serve (REST API)

What I Learned
- Convert first, refactor second - nbconvert gives you a starting point, but you still need to restructure the code.
- Parameterize early - Papermill lets you run the same notebook with different configs without code changes.
- Structure matters - Separate data loading, processing, and model logic into distinct modules.
- Track everything - MLflow tracks parameters, metrics, and model versions. This saved me when I needed to reproduce a result from three months ago.
- Test before deploying - I write unit tests for each module before the code goes anywhere near production.
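To give a flavor of what such a test can look like, here is a small sketch with the standard `unittest` module. The function under test, `find_missing_columns`, is a hypothetical pure-Python stand-in that mirrors the column check inside `load_and_preprocess_data`; it is not part of the code shown earlier.

```python
import unittest


def find_missing_columns(available, features, target):
    """Return the required columns that are absent, mirroring the loader's check."""
    required = list(features) + [target]
    return sorted(set(required) - set(available))


class FindMissingColumnsTest(unittest.TestCase):
    def test_all_columns_present(self):
        missing = find_missing_columns(
            ["feature1", "feature2", "target"], ["feature1", "feature2"], "target"
        )
        self.assertEqual(missing, [])

    def test_reports_missing_feature_and_target(self):
        missing = find_missing_columns(["feature1"], ["feature1", "feature2"], "target")
        self.assertEqual(missing, ["feature2", "target"])


# Run the suite programmatically; `python -m unittest` would also discover it
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FindMissingColumnsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Pulling small checks like this out of the data-loading code makes them testable without touching the file system or a real dataset.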
Summary
In this post, I showed how to transition Jupyter Notebook prototypes to production-ready Python code. The process involves: converting with nbconvert, parameterizing with papermill, refactoring into modular components, and deploying with MLflow. This workflow bridges the gap between exploration and production, giving you both flexibility and reliability.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 nbconvert Documentation
- 👨💻 Papermill Documentation
- 👨💻 MLflow Model Registry
- 👨💻 nbclient GitHub
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!