Dockerization & Containerization
Even if you have trained an ML model in Python 3.10 and built a fast API around it, deploying it to a cloud server (for example, an EC2 instance) that runs Python 3.8 and an older version of scikit-learn can break the application: imports may fail, and a model pickled under one scikit-learn version may not load correctly under another.
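To make the mismatch concrete, here is a minimal sketch of a startup check that compares the running environment against the versions used at training time. The function name find_mismatches and the scikit-learn version shown are hypothetical placeholders, not part of the course examples.

```python
import sys
from importlib import metadata


def find_mismatches(expected_python, expected_packages):
    """Return human-readable differences between this environment and the expected one."""
    problems = []

    # Compare only major.minor, e.g. (3, 10).
    running = tuple(sys.version_info[:2])
    wanted_python = tuple(expected_python)
    if running != wanted_python:
        problems.append(
            f"Python {running[0]}.{running[1]} found, "
            f"expected {wanted_python[0]}.{wanted_python[1]}"
        )

    # Compare each installed package version against the expected one.
    for name, wanted in expected_packages.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed, expected {wanted}")
            continue
        if installed != wanted:
            problems.append(f"{name} {installed} found, expected {wanted}")

    return problems


if __name__ == "__main__":
    # Hypothetical training-time versions; replace them with your own.
    for problem in find_mismatches((3, 10), {"scikit-learn": "1.3.2"}):
        print("MISMATCH:", problem)
```

A check like this only detects the problem. Docker removes it by shipping the interpreter and libraries together with the code.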
1. What is Docker?
Docker is an industry-standard tool that solves the "it worked on my machine" problem. Instead of copying loose files to a server, you package your API, your model .pkl, the exact Python version, and every library dependency into an image. Running that image gives you a container: an isolated, lightweight process environment that, unlike a virtual machine, shares the host's kernel instead of booting its own operating system.
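Because the image bundles its own userland (system libraries, the Python interpreter, and all packages) and relies on the host only for the kernel, it behaves consistently on your Windows laptop (where Docker Desktop runs Linux containers inside a lightweight VM), on an AWS Linux server, or on a Kubernetes cluster, provided the CPU architecture matches.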
2. The Dockerfile
A Dockerfile is a plain-text script that tells Docker how to build your image step by step, starting from a base image.
Standard MLOps Flow:
1. Tell Docker to start from python:3.10-slim, a minimal base image that ships Python 3.10.
2. Tell Docker to create an /app directory inside the container.
3. Tell Docker to copy requirements.txt from your laptop into the container.
4. Tell Docker to run pip install -r requirements.txt.
5. Tell Docker to copy the massive xgboost_model.pkl and your main.py API server code.
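6. Tell Docker that the API listens on port 8000 (EXPOSE). This only documents the port; to let the outside world reach the API you still publish it with -p 8000:8000 when you run the container.
7. Tell Docker to execute uvicorn main:app --host 0.0.0.0 --port 8000 to start the server. Without --host 0.0.0.0, uvicorn binds to 127.0.0.1 and is unreachable from outside the container.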
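The seven steps above map roughly onto the following Dockerfile. This is a sketch, not the contents of Dockerfile_Example. It assumes that requirements.txt, xgboost_model.pkl, and main.py sit next to the Dockerfile and that main.py defines an ASGI application named app.

```dockerfile
# Step 1: start from the slim Python 3.10 base image
FROM python:3.10-slim

# Step 2: create /app inside the image and make it the working directory
WORKDIR /app

# Step 3: copy the dependency list first so the install layer can be cached
COPY requirements.txt .

# Step 4: install the dependencies listed in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Step 5: copy the model artifact and the API server code
COPY xgboost_model.pkl main.py ./

# Step 6: document that the API listens on port 8000
EXPOSE 8000

# Step 7: start the server, bound to all interfaces so it is reachable from outside the container
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Assuming a hypothetical image name ml-api, you would build with docker build -t ml-api . and run with docker run -p 8000:8000 ml-api. Copying requirements.txt before the model and code means that editing main.py does not force pip install to run again on the next build.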
Once built, you push this image to a container registry (such as AWS ECR). Your production server then pulls the image and runs it as a container.
How to execute the examples:
Go to the Examples/ folder and review the file:
Dockerfile_Example