
Ship Your ML Model Like a Pro: End-to-End CI/CD to AWS (ECR and ECS Fargate) with FastAPI & GitHub Actions


Want your ML model to go from “works on my laptop” to “serving real users behind a load balancer”? This tutorial walks you end-to-end: project layout, training a model, packaging an API, writing tests, containerizing, and setting up a production-grade CI/CD pipeline that builds, scans, pushes, and deploys to AWS.

You’ll finish with:

  • A working /predict API backed by your model

  • Automated tests + linting

  • A secure GitHub Actions pipeline using AWS OIDC, Amazon ECR and ECS Fargate

  • Zero server management (fully managed containers)


    What we’re building

    Flow:
    Commit → GitHub Actions: test → build → scan → push image to ECR → deploy ECS task → behind ALB → /predict live.

    Stack: Python, scikit-learn, FastAPI, Docker, pytest, GitHub Actions, AWS ECR, ECS Fargate, ALB.


    1) Repo structure

    ml-cicd-aws/
    ├─ app/
    │  ├─ main.py
    │  ├─ model.py
    │  ├─ schemas.py
    │  ├─ __init__.py
    ├─ data/
    │  └─ iris.csv              # (optional; we’ll use sklearn’s dataset)
    ├─ models/
    │  └─ model.pkl             # produced by training job
    ├─ tests/
    │  ├─ test_api.py
    │  └─ test_model.py
    ├─ training/
    │  └─ train.py
    ├─ Dockerfile
    ├─ requirements.txt
    ├─ runtime.txt              # optional pin for build tooling
    ├─ .dockerignore
    ├─ .gitignore
    ├─ Makefile
    └─ .github/
       └─ workflows/
          └─ ci-cd.yml

    2) Minimal ML model (training script)

    We’ll train a simple classifier (Iris) and persist it to models/model.pkl.

    training/train.py

    import joblib
    from pathlib import Path
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    
    OUT_DIR = Path(__file__).resolve().parents[1] / "models"
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    MODEL_PATH = OUT_DIR / "model.pkl"
    
    def train():
        iris = load_iris()
        X, y = iris.data, iris.target
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=42
        )
        clf = RandomForestClassifier(n_estimators=120, random_state=42)
        clf.fit(X_train, y_train)
        joblib.dump({"model": clf, "target_names": iris.target_names}, MODEL_PATH)
        print(f"Saved: {MODEL_PATH}")
    
    if __name__ == "__main__":
        train()

    Run it locally:

    python -m venv .venv && source .venv/bin/activate
    pip install -r requirements.txt
    python training/train.py
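
    To sanity-check the artifact before wiring up the API, you can load it the same way the service will. A quick local check (not part of the repo layout above):

    import joblib

    # Load the saved bundle and run one prediction; [5.1, 3.5, 1.4, 0.2] is a classic setosa sample.
    blob = joblib.load("models/model.pkl")
    clf, names = blob["model"], blob["target_names"]
    print(names[clf.predict([[5.1, 3.5, 1.4, 0.2]])[0]])  # expected: setosa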

    3) FastAPI app to serve predictions

    app/schemas.py

    from pydantic import BaseModel, Field
    from typing import List
    
    class PredictRequest(BaseModel):
        # For iris: sepal_length, sepal_width, petal_length, petal_width
        instances: List[List[float]] = Field(..., examples=[[[5.1, 3.5, 1.4, 0.2]]])

    app/model.py

    import joblib
    from pathlib import Path
    
    class ModelService:
        def __init__(self, path: str = "models/model.pkl"):
            p = Path(path)
            if not p.exists():
                raise FileNotFoundError(f"Model not found at {p.resolve()}")
            blob = joblib.load(p)
            self.model = blob["model"]
            self.target_names = blob["target_names"]
    
        def predict(self, X):
            labels = self.model.predict(X)
            return [self.target_names[i] for i in labels]
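
    If you also want class probabilities alongside the labels (RandomForestClassifier exposes predict_proba), a small helper on top of ModelService could look like this sketch. It isn’t wired into the endpoints in this tutorial:

    from app.model import ModelService

    def predict_with_proba(svc: ModelService, X):
        """Return each predicted label together with its per-class probabilities."""
        labels = svc.model.predict(X)
        probas = svc.model.predict_proba(X)  # columns follow the iris class order 0..2
        return [
            {
                "label": str(svc.target_names[label]),
                "probabilities": {str(n): float(p) for n, p in zip(svc.target_names, row)},
            }
            for label, row in zip(labels, probas)
        ]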

    app/main.py

    from fastapi import FastAPI
    from app.schemas import PredictRequest
    from app.model import ModelService
    
    app = FastAPI(title="ML Iris Predictor")
    svc = ModelService()  # loads on startup
    
    @app.get("/health")
    def health():
        return {"status": "ok"}
    
    @app.post("/predict")
    def predict(payload: PredictRequest):
        preds = svc.predict(payload.instances)
        return {"predictions": preds}

    4) Tests

    tests/test_model.py

    import os
    from app.model import ModelService
    
    def test_model_loads():
        assert os.path.exists("models/model.pkl"), "Run training/train.py first"
        svc = ModelService()
        assert svc.model is not None

    tests/test_api.py

    from fastapi.testclient import TestClient
    from app.main import app
    
    client = TestClient(app)
    
    def test_health():
        r = client.get("/health")
        assert r.status_code == 200
        assert r.json()["status"] == "ok"
    
    def test_predict():
        payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}
        r = client.post("/predict", json=payload)
        assert r.status_code == 200
        assert "predictions" in r.json()
        assert isinstance(r.json()["predictions"], list)
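
    An optional extra case worth appending to tests/test_api.py: malformed payloads should be rejected by Pydantic validation with a 422, not crash the server.

    def test_predict_rejects_malformed_payload():
        # "instances" must be a list of float lists; a plain string should fail validation.
        r = client.post("/predict", json={"instances": "not-a-list"})
        assert r.status_code == 422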

    5) Requirements

    requirements.txt

    fastapi==0.115.2
    uvicorn[standard]==0.30.6
    scikit-learn==1.5.2
    joblib==1.4.2
    pydantic==2.9.2
    pytest==8.3.3
    httpx==0.27.2  # required by fastapi.testclient (TestClient) in the tests

    6) Containerization

    .dockerignore

    .venv
    __pycache__
    *.pyc
    *.pyo
    .git
    .gitignore
    *.ipynb
    data/

    Dockerfile

    FROM python:3.11-slim
    
    # System deps (curl is needed for the container health checks below)
    RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc build-essential curl && \
        rm -rf /var/lib/apt/lists/*
    
    WORKDIR /app
    
    # Pre-copy requirements for better layer caching
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Copy the rest
    COPY app/ app/
    COPY models/ models/
    
    # FastAPI runs on 8000
    EXPOSE 8000
    
    # Container healthcheck; the ALB additionally probes /health through its target group
    HEALTHCHECK --interval=30s --timeout=3s \
     CMD curl -f http://localhost:8000/health || exit 1
    
    # Start the API
    CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

    Build & run locally:

    docker build -t ml-iris:local .
    docker run -p 8000:8000 ml-iris:local
    # test
    curl -s http://localhost:8000/health
    curl -s -X POST http://localhost:8000/predict -H "Content-Type: application/json" \
      -d '{"instances": [[5.1, 3.5, 1.4, 0.2],[6.0, 2.7, 5.1, 1.6]]}'

    7) Makefile (handy shortcuts)

    Makefile

    IMAGE ?= ml-iris
    TAG ?= local

    venv:
    	python -m venv .venv && . .venv/bin/activate && pip install -r requirements.txt
    
    train:
    	. .venv/bin/activate || true; python training/train.py
    
    test:
    	pytest -q
    
    build:
    	docker build -t $(IMAGE):$(TAG) .
    
    run:
    	docker run -p 8000:8000 $(IMAGE):$(TAG)
    
    .PHONY: venv train test build run

    8) AWS setup (one-time)

    8.1 Create ECR repository

    aws ecr create-repository --repository-name ml-iris --image-scanning-configuration scanOnPush=true

    Note the ECR URI: <ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com/ml-iris

    8.2 Create ECS Cluster + Fargate Service (with ALB)

    You can click through in the console (ECS → Create cluster → Fargate), or use IaC later. For the first pass, the console is fine:

    • Cluster: ml-cluster

  • Task Definition: Fargate, CPU 0.25 vCPU, memory 0.5–1 GB, container port 8000

  • Load Balancer: Application Load Balancer, health check path /health, target type IP

  • Desired tasks: 1 (scale later)

  • Keep AWSVPC networking; pick public subnets (or private with NAT).
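
    One more one-time step worth doing here: the task definition in section 10 ships container logs to the CloudWatch Logs group /ecs/ml-iris, and if that group doesn’t exist the first task can fail to start. Create it from the console, the CLI, or a small boto3 script like this sketch (assumes boto3 is installed and AWS credentials are configured):

    import boto3
    from botocore.exceptions import ClientError

    logs = boto3.client("logs", region_name="ap-south-1")
    try:
        # Must match the awslogs-group used in the task definition.
        logs.create_log_group(logGroupName="/ecs/ml-iris")
        print("Created /ecs/ml-iris")
    except ClientError as e:
        if e.response["Error"]["Code"] == "ResourceAlreadyExistsException":
            print("Log group already exists")
        else:
            raise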


    9) GitHub Actions with OIDC (no long-lived AWS keys)

    9.1 AWS IAM Role for GitHub OIDC

    1. Create OIDC provider (usually already exists if you’ve done it before):

    • Provider URL: https://token.actions.githubusercontent.com

  • Audience: sts.amazonaws.com

    2. Create an IAM role trusted by that provider, and attach a policy that lets it push to ECR and deploy to ECS.

    Trust policy (example):

    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": { "Federated": "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com" },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
          },
          "StringLike": {
            "token.actions.githubusercontent.com:sub": "repo:<GITHUB_OWNER>/<REPO_NAME>:*"
          }
        }
      }]
    }

    Permissions policy (minimal, scoped to ECR/ECS):

    {
      "Version": "2012-10-17",
      "Statement": [
        {"Effect": "Allow","Action": ["ecr:GetAuthorizationToken"],"Resource": "*"},
        {"Effect": "Allow","Action": ["ecr:BatchCheckLayerAvailability","ecr:CompleteLayerUpload","ecr:UploadLayerPart","ecr:InitiateLayerUpload","ecr:PutImage","ecr:DescribeRepositories"],"Resource": "arn:aws:ecr:ap-south-1:<ACCOUNT_ID>:repository/ml-iris"},
        {"Effect": "Allow","Action": ["ecs:DescribeServices","ecs:DescribeTaskDefinition","ecs:RegisterTaskDefinition","ecs:UpdateService"],"Resource": "*"},
        {"Effect": "Allow","Action": ["iam:PassRole"],"Resource": "*","Condition": {"StringEquals":{"iam:PassedToService":"ecs-tasks.amazonaws.com"}}}
      ]
    }

    Save the role ARN as a GitHub Actions secret: AWS_ROLE_TO_ASSUME.

    Also add these Actions secrets:

  • AWS_REGION = ap-south-1 (or your region)

  • AWS_ACCOUNT_ID = your 12-digit account ID (the workflow uses it to build the ECR image URI)

  • ECR_REPO = ml-iris

  • ECS_CLUSTER = ml-cluster

  • ECS_SERVICE = ml-iris-service (the service you’ll create)

  • (Optional) IMAGE_TAG, which defaults to github.sha in the workflow


    10) CI/CD pipeline

    .github/workflows/ci-cd.yml

    name: CI/CD - ML to AWS Fargate
    
    on:
      push:
        branches: [ "main" ]
      workflow_dispatch:
    
    permissions:
      id-token: write   # for OIDC
      contents: read
    
    env:
      AWS_REGION: ${{ secrets.AWS_REGION }}
      ECR_REPO: ${{ secrets.ECR_REPO }}
      ECS_CLUSTER: ${{ secrets.ECS_CLUSTER }}
      ECS_SERVICE: ${{ secrets.ECS_SERVICE }}
      IMAGE_TAG: ${{ github.sha }}
    
    jobs:
      build-test:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout
            uses: actions/checkout@v4
    
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: "3.11"
    
          - name: Install deps
            run: pip install -r requirements.txt
    
          - name: Train model (fresh artifact)
            run: python training/train.py
    
          - name: Run tests
            run: pytest -q
    
          - name: Build Docker image
            run: |
              IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
              docker build -t $IMAGE_URI .
    
          - name: Trivy scan (optional but recommended)
            uses: aquasecurity/trivy-action@0.28.0
            with:
              image-ref: ${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
              format: 'table'
              vuln-type: 'os,library'
              exit-code: '0'  # don't fail build initially; tighten later
    
          - name: Configure AWS credentials (OIDC)
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
              aws-region: ${{ env.AWS_REGION }}
    
          - name: Login to ECR
            id: ecr
            uses: aws-actions/amazon-ecr-login@v2
    
          - name: Push to ECR
            run: |
              IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
              docker push $IMAGE_URI
    
          - name: Render task definition
            id: taskdef
            run: |
              IMAGE_URI=${{ secrets.AWS_ACCOUNT_ID }}.dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com/${{ env.ECR_REPO }}:${{ env.IMAGE_TAG }}
              cat > taskdef.json << 'JSON'
              {
                "family": "ml-iris-td",
                "networkMode": "awsvpc",
                "requiresCompatibilities": ["FARGATE"],
                "cpu": "256",
                "memory": "512",
                "executionRoleArn": "arn:aws:iam::${ACCOUNT_ID}:role/ecsTaskExecutionRole",
                "taskRoleArn": "arn:aws:iam::${ACCOUNT_ID}:role/ecsTaskRole",
                "containerDefinitions": [{
                  "name": "ml-iris",
                  "image": "${IMAGE_URI}",
                  "essential": true,
                  "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
                  "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                      "awslogs-group": "/ecs/ml-iris",
                      "awslogs-region": "${REGION}",
                      "awslogs-stream-prefix": "ecs"
                    }
                  },
                  "healthCheck": {
                    "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
                    "interval": 30,
                    "timeout": 5,
                    "retries": 3,
                    "startPeriod": 10
                  }
                }]
              }
              JSON
              sed -i "s|\${IMAGE_URI}|$IMAGE_URI|g" taskdef.json
              sed -i "s|\${ACCOUNT_ID}|${{ secrets.AWS_ACCOUNT_ID }}|g" taskdef.json
              sed -i "s|\${REGION}|${{ env.AWS_REGION }}|g" taskdef.json
              cat taskdef.json
    
          - name: Register new task definition
            id: register
            run: |
              ARN=$(aws ecs register-task-definition --cli-input-json file://taskdef.json --query 'taskDefinition.taskDefinitionArn' --output text)
              echo "TASK_DEF_ARN=$ARN" >> $GITHUB_OUTPUT
    
          - name: Deploy service (rolling update)
            run: |
              aws ecs update-service \
                --cluster "${{ env.ECS_CLUSTER }}" \
                --service "${{ env.ECS_SERVICE }}" \
                --task-definition "${{ steps.register.outputs.TASK_DEF_ARN }}" \
                --force-new-deployment
    
          - name: Wait for stability
            run: |
              aws ecs wait services-stable \
                --cluster "${{ env.ECS_CLUSTER }}" \
                --services "${{ env.ECS_SERVICE }}"

    Set the AWS_ACCOUNT_ID secret and adjust the role ARNs, cluster, and service names to match your account.


    11) Create the ECS service (once)

    If you didn’t do the console wizard, you can create the service after the first task definition registers:

    # Example (adjust subnets, security groups, and the ALB target group ARN accordingly)
    aws ecs create-service \
      --cluster ml-cluster \
      --service-name ml-iris-service \
      --task-definition ml-iris-td \
      --desired-count 1 \
      --launch-type FARGATE \
      --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx,subnet-yyy],securityGroups=[sg-zzz],assignPublicIp=ENABLED}" \
      --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:ap-south-1:<ACCOUNT_ID>:targetgroup/ml-tg/abc123,containerName=ml-iris,containerPort=8000"

    Point your ALB listener (port 80/443) to that target group. Health check path: /health.
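
    You’ll need the ALB’s DNS name for the next step. Copy it from the console, or look it up with a small boto3 sketch like this one (it assumes your load balancer is named ml-alb; adjust to whatever you created):

    import boto3

    elbv2 = boto3.client("elbv2", region_name="ap-south-1")
    lb = elbv2.describe_load_balancers(Names=["ml-alb"])["LoadBalancers"][0]
    print(lb["DNSName"])  # the hostname to use in the curl commands below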


    12) Try it live

    Once the workflow finishes, hit the ALB DNS:

    curl http://<your-alb-dns>/health
    curl -X POST http://<your-alb-dns>/predict -H "Content-Type: application/json" \
      -d '{"instances": [[6.2, 2.8, 4.8, 1.8]]}'

    You should see a species prediction (e.g., virginica).
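
    The same checks from Python, if you prefer that over curl (requests is not in requirements.txt, so pip install requests first, and substitute your ALB DNS name):

    import requests

    base = "http://<your-alb-dns>"
    print(requests.get(f"{base}/health", timeout=5).json())
    print(requests.post(f"{base}/predict",
                        json={"instances": [[6.2, 2.8, 4.8, 1.8]]},
                        timeout=5).json())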


    13) Production tips

    • Reproducible training: Move training into a separate job/artifact, version artifacts (model-YYYYMMDD.pkl) and pin them in releases.

  • Model registry: Store artifacts in S3 with versioning; pass the S3 URL via an ECS task env var (see the sketch after this list).

  • Secrets: Use SSM Parameter Store or Secrets Manager; mount via task definition.

  • Observability: Use CloudWatch Logs Insights on /ecs/ml-iris, and add request metrics via a sidecar (e.g., a Prometheus exporter) or API middleware.

  • Autoscaling: Configure Target Tracking on ECS service (CPU/Memory or ALB RequestCount).

  • Blue/Green: Use CodeDeploy for zero-downtime if you need canaries/linear rollouts.

  • Security: Restrict IAM to least privilege, use private subnets + NAT in production.
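
    For the model-registry tip above, here is a minimal sketch of pulling the artifact from S3 at container startup. It assumes boto3 is available in the image and that the task definition sets a MODEL_S3_URI environment variable (for example s3://your-bucket/models/model-20250101.pkl); neither is part of the repo above. ModelService could then be pointed at the returned path.

    import os
    from pathlib import Path

    import boto3

    def fetch_model(local_path: str = "models/model.pkl") -> str:
        """Download the model from S3 when MODEL_S3_URI is set; otherwise use the baked-in file."""
        uri = os.getenv("MODEL_S3_URI")
        if not uri:
            return local_path
        bucket, key = uri.removeprefix("s3://").split("/", 1)
        Path(local_path).parent.mkdir(parents=True, exist_ok=True)
        boto3.client("s3").download_file(bucket, key, local_path)
        return local_path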


    14) Local developer experience (optional)

    A docker-compose.yml for quick local spins:

    version: "3.9"
    services:
      api:
        build: .
        ports:
          - "8000:8000"

    15) Common pitfalls & fixes

    • Model file missing in container
      Make sure models/model.pkl exists before building the image (run training in CI first).

  • ALB health check failing
    Confirm containerPort 8000, target group health check path /health, security groups allow ALB → ECS.

  • Permission errors in CI
    Recheck IAM role trust policy sub matches repo:OWNER/REPO:*, and id-token: write is enabled.

    16) Wrap-up

    In this tutorial, you trained a scikit-learn model, exposed it via a FastAPI endpoint, containerized it with Docker, and built a secure CI/CD pipeline using GitHub Actions’ OIDC to deploy on AWS ECS Fargate behind an Application Load Balancer. Every push to main tests your code, scans your image, pushes to ECR, and rolls out a new task, giving you a reproducible, auditable path from experimentation to real-world traffic.
