
Deploying ML Models with FastAPI

Overview

FastAPI is one of the most widely used frameworks for serving ML models. It is fast, generates interactive API documentation automatically, and handles async requests efficiently.

Basic Setup

```bash
pip install fastapi uvicorn pydantic
```

```python
from fastapi import FastAPI
from pydantic import BaseModel
import pickle

app = FastAPI(title="ML Model API")

# Load model at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class PredictionInput(BaseModel):
    features: list[float]

class PredictionOutput(BaseModel):
    prediction: float
    confidence: float

@app.post("/predict", response_model=PredictionOutput)
async def predict(input: PredictionInput):
    pred = model.predict([input.features])[0]
    proba = model.predict_proba([input.features])[0].max()
    return PredictionOutput(prediction=pred, confidence=proba)
```

Running the Server

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Health Check Endpoint

```python
@app.get("/health")
async def health():
    return {"status": "healthy", "model_loaded": model is not None}
```

Batch Predictions

```python
class BatchInput(BaseModel):
    instances: list[list[float]]

@app.post("/predict/batch")
async def predict_batch(input: BatchInput):
    predictions = model.predict(input.instances)
    return {"predictions": predictions.tolist()}
```

Docker Deployment

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

```bash
docker build -t ml-api .
docker run -p 8000:8000 ml-api
```

Production Tips

- Use Gunicorn: `gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker`
- Add rate limiting: use slowapi or nginx
- Monitor latency: add Prometheus metrics
- Cache predictions: use Redis for repeated inputs (see the sketch below)
- Validate inputs: Pydantic handles this automatically

Key Resources

- FastAPI Documentation
- Uvicorn
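The caching tip above can be sketched roughly as follows. This is a minimal illustration, assuming a local Redis instance and the redis-py client; the key scheme, the one-hour TTL, and the `cached_predict` helper are illustrative choices, not part of the original setup.

```python
import hashlib
import json

import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cached_predict(features: list[float]) -> float:
    # Hash the feature vector so identical inputs map to the same cache key.
    key = "pred:" + hashlib.sha256(json.dumps(features).encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return float(hit)
    # Cache miss: run the model and keep the result for an hour.
    pred = float(model.predict([features])[0])
    cache.set(key, pred, ex=3600)
    return pred
```

The `/predict` handler could then call `cached_predict(input.features)` instead of invoking the model directly.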

December 1, 2025 · 1 min · 199 words · BlogIA Team

MLOps with MLflow

Overview

MLflow is one of the most widely used open-source platforms for managing the ML lifecycle. It handles experiment tracking, model packaging, and deployment.

Installation

```bash
pip install mlflow
mlflow ui  # Start the tracking UI at localhost:5000
```

Experiment Tracking

```python
import mlflow

mlflow.set_experiment("my-classification")

with mlflow.start_run():
    # Log parameters
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("epochs", 100)

    # Train model...

    # Log metrics
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("f1_score", 0.93)

    # Log model
    mlflow.sklearn.log_model(model, "model")

    # Log artifacts
    mlflow.log_artifact("confusion_matrix.png")
```

Autologging

```python
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

mlflow.sklearn.autolog()  # sklearn parameters, metrics, and the fitted model are logged automatically

model = RandomForestClassifier()
model.fit(X_train, y_train)
```

Model Registry

```python
# Register a model
mlflow.register_model("runs:/abc123/model", "ProductionClassifier")

# Load from registry
model = mlflow.pyfunc.load_model("models:/ProductionClassifier/Production")
```

Model Serving

```bash
# Serve model as a REST API
mlflow models serve -m "models:/ProductionClassifier/1" -p 5001

# Query the API
curl -X POST http://localhost:5001/invocations \
  -H "Content-Type: application/json" \
  -d '{"inputs": [[1.0, 2.0, 3.0, 4.0]]}'
```

Project Structure

```
my_project/
├── MLproject      # Project definition
├── conda.yaml     # Environment
├── train.py       # Training script
└── data/
```

```yaml
# MLproject
name: my_project
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      learning_rate: {type: float, default: 0.01}
    command: "python train.py --lr {learning_rate}"
```

Best Practices

- Use experiments: group related runs
- Tag runs: add metadata for filtering
- Version data: log dataset hashes (see the sketch below)
- Automate: use autolog when possible
- Compare: use the MLflow UI to compare runs

Key Resources

- MLflow Documentation
- MLflow GitHub
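The tagging and data-versioning practices above can be combined in a single run. The tag names and the `file_sha256` helper in this sketch are illustrative assumptions; only the `mlflow.*` calls come from MLflow itself.

```python
import hashlib

import mlflow

def file_sha256(path: str) -> str:
    # Hash the dataset file so the exact data version is recorded with the run.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

mlflow.set_experiment("my-classification")

with mlflow.start_run():
    mlflow.set_tag("stage", "baseline")            # free-form metadata, filterable in the UI
    mlflow.set_tag("data_version", "training-v2")
    mlflow.log_param("data_sha256", file_sha256("data/training.csv"))
    # ... train and log metrics as in the tracking example above ...
```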

December 1, 2025 · 1 min · 204 words · BlogIA Team

Git Best Practices for ML Projects

Overview

ML projects have unique version control challenges: large files, experiment tracking, and model versioning. This guide covers Git best practices for ML.

.gitignore for ML

```
# Data
data/
*.csv
*.parquet
*.json
!config.json

# Models
*.pt
*.pth
*.onnx
*.pkl
models/

# Checkpoints
checkpoints/
*.ckpt

# Logs
logs/
wandb/
mlruns/

# Environment
.venv/
__pycache__/
*.pyc

# Notebooks
.ipynb_checkpoints/

# IDE
.vscode/
.idea/
```

Git LFS for Large Files

```bash
# Install Git LFS
git lfs install

# Track large files
git lfs track "*.pt"
git lfs track "*.onnx"
git lfs track "data/*.parquet"

# Commit .gitattributes
git add .gitattributes
git commit -m "Configure Git LFS"
```

DVC for Data Versioning

```bash
# Install DVC
pip install dvc

# Initialize
dvc init

# Track data
dvc add data/training.csv

# Push to remote storage
dvc remote add -d storage s3://my-bucket/dvc
dvc push

# Pull data
dvc pull
```

Branching Strategy

```
main
├── develop
│   ├── feature/new-model
│   ├── feature/data-pipeline
│   └── experiment/bert-large
└── release/v1.0
```

Commit Messages

```
# Format: type(scope): description
feat(model): add BERT classifier
fix(data): handle missing values in preprocessing
exp(training): test learning rate 1e-4
docs(readme): add installation instructions
refactor(pipeline): simplify data loading
```

Experiment Tracking with Git

```bash
# Create experiment branch
git checkout -b exp/lr-sweep-001

# Run experiment
python train.py --lr 0.001

# Commit results
git add results/
git commit -m "exp: lr=0.001, acc=0.92"

# Tag successful experiments
git tag -a exp-lr001-acc92 -m "Best LR experiment"
```

Pre-commit Hooks

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.6
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-added-large-files
        args: ['--maxkb=1000']
```

```bash
pip install pre-commit
pre-commit install
```

Best Practices

- Never commit data: use DVC or Git LFS
- Never commit secrets: use environment variables (see the sketch below)
- Small commits: one logical change per commit
- Meaningful messages: describe what and why
- Branch per experiment: easy to compare and revert

Key Resources

- DVC Documentation
- Git LFS
- Pre-commit
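For the secrets rule above, one common pattern is to keep credentials in an untracked .env file and read them from the environment at runtime. This is a sketch under assumptions: the variable names are hypothetical, and it relies on the python-dotenv package with .env listed in .gitignore.

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # loads variables from an untracked .env file, if one exists

# Hypothetical credentials, read from the environment instead of source control
WANDB_API_KEY = os.environ["WANDB_API_KEY"]           # raises KeyError if unset
S3_BUCKET = os.environ.get("S3_BUCKET", "my-bucket")  # optional, with a default
```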

December 1, 2025 · 2 min · 315 words · BlogIA Team

Containerizing ML Applications with Docker

Overview

Docker containers ensure your ML application runs identically everywhere. This guide covers containerization best practices for ML workloads.

Basic Dockerfile

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first (cached layer)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Multi-Stage Build

Reduce image size by separating build and runtime:

```dockerfile
# Build stage
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Runtime stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
```

GPU Support

```dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y python3 python3-pip
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python3", "train.py"]
```

Run with GPU: ...
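Running a CUDA-based image like the one above requires the GPU to be exposed to the container. Assuming the NVIDIA Container Toolkit is installed on the host, a typical invocation looks like the following; the image name is illustrative.

```bash
# Build the training image and pass all host GPUs through to the container
docker build -t ml-train .
docker run --gpus all ml-train
```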

December 1, 2025 · 2 min · 295 words · BlogIA Team