Ship Your ML Model Like a Pro: End-to-End CI/CD to AWS (ECR and ECS Fargate) with FastAPI & GitHub Actions

Want your ML model to go from “works on my laptop” to “serving real users behind a load balancer”? This tutorial walks you end-to-end: project layout, training a model, packaging an API, writing tests, containerizing, and setting up a production-grade CI/CD pipeline that builds, scans, pushes, and deploys to AWS.
You’ll finish with:
A working /predict API backed by your model
Automated tests + linting
A secure GitHub Actions pipeline using AWS OIDC, Amazon ECR and ECS Fargate
Zero server management (fully managed containers)
What we’re building
Flow:
Commit → GitHub Actions: test → build → scan → push image to ECR → deploy ECS task → behind ALB → /predict live.
Stack: Python, scikit-learn, FastAPI, Docker, pytest, GitHub Actions, AWS ECR, ECS Fargate, ALB.
1) Repo structure
ml-cicd-aws/
├─ app/
│ ├─ main.py
│ ├─ model.py
│ ├─ schemas.py
│ ├─ __init__.py
├─ data/
│ └─ iris.csv # (optional; we’ll use sklearn’s dataset)
├─ models/
│ └─ model.pkl # produced by training job
├─ tests/
│ ├─ test_api.py
│ └─ test_model.py
├─ training/
│ └─ train.py
├─ Dockerfile
├─ requirements.txt
├─ runtime.txt # optional pin for build tooling
├─ .dockerignore
├─ .gitignore
├─ Makefile
└─ .github/
└─ workflows/
└─ ci-cd.yml
2) Minimal ML model (training script)
We’ll train a simple classifier (Iris) and persist it to models/model.pkl.
training/train.py
import joblib
from pathlib import Path
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

OUT_DIR = Path(__file__).resolve().parents[1] / "models"
OUT_DIR.mkdir(parents=True, exist_ok=True)
MODEL_PATH = OUT_DIR / "model.pkl"

def train():
    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    clf = RandomForestClassifier(n_estimators=120, random_state=42)
    clf.fit(X_train, y_train)
    print(f"Test accuracy: {clf.score(X_test, y_test):.3f}")
    joblib.dump({"model": clf, "target_names": iris.target_names}, MODEL_PATH)
    print(f"Saved: {MODEL_PATH}")

if __name__ == "__main__":
    train()
Run it locally:
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python training/train.py
3) FastAPI app to serve predictions
app/schemas.py
from pydantic import BaseModel, Field
from typing import List

class PredictRequest(BaseModel):
    # For iris: sepal_length, sepal_width, petal_length, petal_width
    # (Pydantic v2 uses `examples`; the v1-style `example` kwarg is deprecated.)
    instances: List[List[float]] = Field(..., examples=[[[5.1, 3.5, 1.4, 0.2]]])
app/model.py
import joblib
from pathlib import Path

class ModelService:
    def __init__(self, path: str = "models/model.pkl"):
        p = Path(path)
        if not p.exists():
            raise FileNotFoundError(f"Model not found at {p.resolve()}")
        blob = joblib.load(p)
        self.model = blob["model"]
        self.target_names = blob["target_names"]

    def predict(self, X):
        labels = self.model.predict(X)
        return [self.target_names[i] for i in labels]
app/main.py
from fastapi import FastAPI
from app.schemas import PredictRequest
from app.model import ModelService

app = FastAPI(title="ML Iris Predictor")
svc = ModelService()  # loads the model once at startup

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
def predict(payload: PredictRequest):
    preds = svc.predict(payload.instances)
    return {"predictions": preds}
4) Tests
tests/test_model.py
import os
from app.model import ModelService

def test_model_loads():
    assert os.path.exists("models/model.pkl"), "Run training/train.py first"
    svc = ModelService()
    assert svc.model is not None
tests/test_api.py
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_health():
    r = client.get("/health")
    assert r.status_code == 200
    assert r.json()["status"] == "ok"

def test_predict():
    payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}
    r = client.post("/predict", json=payload)
    assert r.status_code == 200
    assert "predictions" in r.json()
    assert isinstance(r.json()["predictions"], list)
5) Requirements
requirements.txt
fastapi==0.115.2
uvicorn[standard]==0.30.6
scikit-learn==1.5.2
joblib==1.4.2
pydantic==2.9.2
pytest==8.3.3
httpx==0.27.2
(httpx is required by Starlette’s TestClient, which our API tests use.)
6) Containerization
.dockerignore
.venv
__pycache__
*.pyc
*.pyo
.git
.gitignore
*.ipynb
data/
Dockerfile
FROM python:3.11-slim

# System deps (curl is needed by the HEALTHCHECK below; slim images don't ship it)
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc build-essential curl && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Pre-copy requirements for better layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest
COPY app/ app/
COPY models/ models/

# FastAPI runs on 8000
EXPOSE 8000

# Container-level health check (the ALB additionally probes /health over HTTP)
HEALTHCHECK --interval=30s --timeout=3s \
    CMD curl -f http://localhost:8000/health || exit 1

# Start the API
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Build & run locally:
docker build -t ml-iris:local .
docker run -p 8000:8000 ml-iris:local
# test
curl -s http://localhost:8000/health
curl -s -X POST http://localhost:8000/predict -H "Content-Type: application/json" \
-d '{"instances": [[5.1, 3.5, 1.4, 0.2],[6.0, 2.7, 5.1, 1.6]]}'
7) Makefile (handy shortcuts)
Makefile
# Recipe lines must start with a tab character.
IMAGE ?= ml-iris
TAG ?= local

.PHONY: venv train test build run

venv:
	python -m venv .venv && . .venv/bin/activate && pip install -r requirements.txt

train:
	. .venv/bin/activate || true; python training/train.py

test:
	pytest -q

build:
	docker build -t $(IMAGE):$(TAG) .

run:
	docker run -p 8000:8000 $(IMAGE):$(TAG)
8) AWS setup (one-time)
8.1 Create ECR repository
aws ecr create-repository --repository-name ml-iris --image-scanning-configuration scanOnPush=true
Note the ECR URI: ACCOUNT_ID.dkr.ecr.ap-south-1.amazonaws.com/ml-iris
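That URI format comes up repeatedly in the workflow below, so it helps to see its parts spelled out. A quick sketch with placeholder values:

```python
# Anatomy of an ECR image URI (all values here are placeholders).
account_id = "123456789012"
region = "ap-south-1"
repo = "ml-iris"
tag = "abc123"  # the workflow uses the commit SHA

image_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"
print(image_uri)
```
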
8.2 Create ECS Cluster + Fargate Service (with ALB)
You can click-through in the console (ECS → Create cluster → Fargate), or use IaC later. For the first pass, console is fine:
Cluster: ml-cluster
Task Definition: Fargate, CPU 0.25 vCPU, memory 0.5–1 GB, container port 8000
Load Balancer: Application Load Balancer, health check path /health, target type IP
Desired tasks: 1 (scale later)
Keep awsvpc networking; pick public subnets (or private with NAT).
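Note that Fargate only accepts certain CPU/memory pairings, so a 0.25 vCPU task must use 0.5, 1, or 2 GB. A sketch of the lower end of that matrix (values in CPU units and MiB, as the task definition expects; verify against current AWS docs before relying on it):

```python
# Partial map of valid Fargate CPU (units) -> memory (MiB) combinations.
FARGATE_COMBOS = {
    256: [512, 1024, 2048],                             # 0.25 vCPU
    512: [1024, 2048, 3072, 4096],                      # 0.5 vCPU
    1024: [2048, 3072, 4096, 5120, 6144, 7168, 8192],   # 1 vCPU
}

def valid_task_size(cpu: int, memory: int) -> bool:
    """Check a cpu/memory pair against the (partial) Fargate matrix."""
    return memory in FARGATE_COMBOS.get(cpu, [])

print(valid_task_size(256, 512))  # the "cpu": "256", "memory": "512" task used later
```
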
9) GitHub Actions with OIDC (no long-lived AWS keys)
9.1 AWS IAM Role for GitHub OIDC
Create the OIDC identity provider (it usually already exists if you’ve set this up before):
Provider URL: https://token.actions.githubusercontent.com
Audience: sts.amazonaws.com
Then create an IAM role trusted by that provider, and attach a policy that allows pushing to ECR and deploying to ECS.
Trust policy (example):
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:<GITHUB_OWNER>/<REPO_NAME>:*"
      }
    }
  }]
}
Permissions policy (minimal to ECR/ECS):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ecr:GetAuthorizationToken"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:CompleteLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:InitiateLayerUpload",
        "ecr:PutImage",
        "ecr:DescribeRepositories"
      ],
      "Resource": "arn:aws:ecr:ap-south-1:<ACCOUNT_ID>:repository/ml-iris"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeServices",
        "ecs:DescribeTaskDefinition",
        "ecs:RegisterTaskDefinition",
        "ecs:UpdateService"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["iam:PassRole"],
      "Resource": "*",
      "Condition": {"StringEquals": {"iam:PassedToService": "ecs-tasks.amazonaws.com"}}
    }
  ]
}
Save the role ARN as a GitHub Actions secret: AWS_ROLE_TO_ASSUME.
Also add these Actions secrets:
AWS_ACCOUNT_ID = your 12-digit AWS account ID (the workflow uses it to build the ECR image URI)
AWS_REGION = ap-south-1 (or your region)
ECR_REPO = ml-iris
ECS_CLUSTER = ml-cluster
ECS_SERVICE = ml-iris-service (the service you’ll create)
(Optional) IMAGE_TAG — defaults to github.sha in the workflow
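To build intuition for the StringLike condition in the trust policy above: AWS matches the token’s sub claim (for a push that looks like repo:OWNER/REPO:ref:refs/heads/main) against the wildcard pattern, where * matches any sequence of characters. A rough stdlib analogy using fnmatch (an approximation of AWS’s matcher, with a made-up owner name):

```python
from fnmatch import fnmatchcase

# Pattern from the trust policy; "acme/ml-cicd-aws" is a placeholder repo.
pattern = "repo:acme/ml-cicd-aws:*"

subs = [
    "repo:acme/ml-cicd-aws:ref:refs/heads/main",           # push to main: matches
    "repo:acme/ml-cicd-aws:pull_request",                  # PR run: also matched by :*
    "repo:someone-else/ml-cicd-aws:ref:refs/heads/main",   # other owner: no match
]
for sub in subs:
    print(sub, "->", fnmatchcase(sub, pattern))
```

If you only want pushes to main to assume the role, tighten the pattern to repo:OWNER/REPO:ref:refs/heads/main.
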
10) CI/CD pipeline
.github/workflows/ci-cd.yml
name: CI/CD - ML to AWS Fargate

on:
  push:
    branches: [ "main" ]
  workflow_dispatch:

permissions:
  id-token: write   # for OIDC
  contents: read

env:
  AWS_REGION: ${{ secrets.AWS_REGION }}
  ECR_REPO: ${{ secrets.ECR_REPO }}
  ECS_CLUSTER: ${{ secrets.ECS_CLUSTER }}
  ECS_SERVICE: ${{ secrets.ECS_SERVICE }}
  IMAGE_TAG: ${{ github.sha }}

jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install deps
        run: pip install -r requirements.txt

      - name: Train model (fresh artifact)
        run: python training/train.py

      - name: Run tests
        run: pytest -q

      - name: Build Docker image
        run: |
          IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
          docker build -t $IMAGE_URI .

      - name: Trivy scan (optional but recommended)
        uses: aquasecurity/trivy-action@0.28.0
        with:
          image-ref: ${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
          format: 'table'
          vuln-type: 'os,library'
          exit-code: '0'   # don't fail the build initially; tighten later

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to ECR
        id: ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Push to ECR
        run: |
          IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
          docker push $IMAGE_URI

      - name: Render task definition
        id: taskdef
        run: |
          IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
          cat > taskdef.json << 'JSON'
          {
            "family": "ml-iris-td",
            "networkMode": "awsvpc",
            "requiresCompatibilities": ["FARGATE"],
            "cpu": "256",
            "memory": "512",
            "executionRoleArn": "arn:aws:iam::${ACCOUNT_ID}:role/ecsTaskExecutionRole",
            "taskRoleArn": "arn:aws:iam::${ACCOUNT_ID}:role/ecsTaskRole",
            "containerDefinitions": [{
              "name": "ml-iris",
              "image": "${IMAGE_URI}",
              "essential": true,
              "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
              "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                  "awslogs-group": "/ecs/ml-iris",
                  "awslogs-region": "${REGION}",
                  "awslogs-stream-prefix": "ecs"
                }
              },
              "healthCheck": {
                "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
                "interval": 30,
                "timeout": 5,
                "retries": 3,
                "startPeriod": 10
              }
            }]
          }
          JSON
          sed -i "s|\${IMAGE_URI}|$IMAGE_URI|g" taskdef.json
          sed -i "s|\${ACCOUNT_ID}|${{ secrets.AWS_ACCOUNT_ID }}|g" taskdef.json
          sed -i "s|\${REGION}|${{ env.AWS_REGION }}|g" taskdef.json
          cat taskdef.json

      - name: Register new task definition
        id: register
        run: |
          ARN=$(aws ecs register-task-definition --cli-input-json file://taskdef.json --query 'taskDefinition.taskDefinitionArn' --output text)
          echo "TASK_DEF_ARN=$ARN" >> $GITHUB_OUTPUT

      - name: Deploy service (rolling update)
        run: |
          aws ecs update-service \
            --cluster "${{ env.ECS_CLUSTER }}" \
            --service "${{ env.ECS_SERVICE }}" \
            --task-definition "${{ steps.register.outputs.TASK_DEF_ARN }}" \
            --force-new-deployment

      - name: Wait for stability
        run: |
          aws ecs wait services-stable \
            --cluster "${{ env.ECS_CLUSTER }}" \
            --services "${{ env.ECS_SERVICE }}"
Replace secrets.AWS_ACCOUNT_ID, the role names, and the cluster & service names to match your account.
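The three sed calls in the render step are plain placeholder substitution. The same idea in Python's stdlib, if you prefer rendering the task definition outside the shell (a sketch over a trimmed-down template; the workflow itself uses sed):

```python
from string import Template

# Same ${...} placeholders that the workflow's sed commands rewrite.
taskdef_tpl = Template('{"image": "${IMAGE_URI}", "awslogs-region": "${REGION}"}')

rendered = taskdef_tpl.substitute(
    IMAGE_URI="123456789012.dkr.ecr.ap-south-1.amazonaws.com/ml-iris:abc123",  # example value
    REGION="ap-south-1",
)
print(rendered)
```
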
11) Create the ECS service (once)
If you didn’t do the console wizard, you can create the service after the first task definition registers:
# Example (adjust subnets, security groups, and the ALB target group ARN accordingly)
aws ecs create-service \
--cluster ml-cluster \
--service-name ml-iris-service \
--task-definition ml-iris-td \
--desired-count 1 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxx,subnet-yyy],securityGroups=[sg-zzz],assignPublicIp=ENABLED}" \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:ap-south-1:<ACCOUNT_ID>:targetgroup/ml-tg/abc123,containerName=ml-iris,containerPort=8000"
Point your ALB listener (port 80/443) at that target group. Health check path: /health.
12) Try it live
Once the workflow finishes, hit the ALB DNS:
curl http://<your-alb-dns>/health
curl -X POST http://<your-alb-dns>/predict -H "Content-Type: application/json" \
-d '{"instances": [[6.2, 2.8, 4.8, 1.8]]}'
You should see a species prediction (e.g., virginica).
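The same smoke test can be scripted from Python's stdlib for post-deploy checks (the ALB hostname is a placeholder, and build_predict_request is a hypothetical helper, not part of the repo):

```python
import json
from urllib import request

def build_predict_request(base_url: str, instances):
    """Assemble the POST /predict request the API expects."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request("http://your-alb-dns.example.com", [[6.2, 2.8, 4.8, 1.8]])
# Against a live deployment, send it and read the predictions:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["predictions"])
```
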
13) Production tips
Reproducible training: Move training into a separate job/artifact, version artifacts (model-YYYYMMDD.pkl), and pin them in releases.
Model registry: Store artifacts in S3 with versioning; pass S3 URL via ECS task env var.
Secrets: Use SSM Parameter Store or Secrets Manager; mount via task definition.
Observability: Enable CloudWatch Logs Insights for /ecs/ml-iris; add request metrics via a sidecar (e.g., Prometheus exporter) or API middleware.
Autoscaling: Configure Target Tracking on ECS service (CPU/Memory or ALB RequestCount).
Blue/Green: Use CodeDeploy for zero-downtime if you need canaries/linear rollouts.
Security: Restrict IAM to least privilege, use private subnets + NAT in production.
14) Local developer experience (optional)
docker-compose.yml for quick spins:
version: "3.9"
services:
  api:
    build: .
    ports:
      - "8000:8000"
15) Common pitfalls & fixes
Model file missing in container
Make sure models/model.pkl exists before building the image (run training in CI first).
ALB health check failing
Confirm containerPort 8000, target group health check path /health, and that security groups allow ALB → ECS traffic.
Permission errors in CI
Recheck that the IAM role trust policy's sub condition matches repo:OWNER/REPO:*, and that id-token: write is enabled in the workflow.
In this tutorial, you trained a scikit-learn model, exposed it via a FastAPI endpoint, containerized it with Docker, and built a secure CI/CD pipeline using GitHub Actions' OIDC to deploy on AWS ECS Fargate behind an Application Load Balancer. Every push to main tests your code, scans your image, pushes to ECR, and rolls out a new task, giving you a reproducible, auditable path from experimentation to real-world traffic.