How We Cut AWS Costs by 85%: From $487 to $73/Month

📅 January 13, 2025 ⏱️ 18 min read 👤 Louis Castaneda
$4,968/year saved

85% reduction in AWS costs

Healthcare SaaS Platform | 10,000 Daily Users | Zero Downtime Migration

This isn't another "we saved money on AWS" fluff piece. This is the exact playbook we used to cut a healthcare platform's AWS bill from $487 to $73/month - including the mistakes we made and the code we wrote.

📊 The Starting Point: A Typical Over-Provisioned Setup

❌ Before: $487/month

  • 2x m5.large EC2 instances ($146.40)
  • Application Load Balancer ($22.50)
  • RDS MySQL t3.medium ($89.28)
  • 200 GB EBS storage ($20.00)
  • NAT Gateway ($45.00)
  • Data transfer (~$45.00)
  • CloudWatch, backups, etc. ($118.82)

✅ After: $73/month

  • Lambda functions ($42.00)
  • API Gateway (included)
  • DynamoDB on-demand ($25.00)
  • S3 storage ($2.30)
  • CloudFront CDN ($1.20)
  • CloudWatch Logs ($0.50)
  • Minimal data transfer ($2.00)
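
If you want to sanity-check the headline numbers, here's a minimal sketch that adds up the line items from the two lists above. The variable names are ours for illustration; API Gateway is left out because the list gives no separate price for it.

```javascript
// Line items copied from the before/after lists above (USD per month).
const beforeItems = [146.40, 22.50, 89.28, 20.00, 45.00, 45.00, 118.82];
const afterItems = [42.00, 25.00, 2.30, 1.20, 0.50, 2.00];

// Sum in whole cents so floating-point drift cannot skew the totals.
const totalUsd = (items) =>
  items.reduce((cents, usd) => cents + Math.round(usd * 100), 0) / 100;

const beforeUsd = totalUsd(beforeItems);
const afterUsd = totalUsd(afterItems);
const monthlySavings = beforeUsd - afterUsd;
const annualSavings = monthlySavings * 12;
const reductionPct = Math.round((monthlySavings / beforeUsd) * 100);

console.log({ beforeUsd, afterUsd, monthlySavings, annualSavings, reductionPct });
```

The totals come out to the $487, $73, $414/month, $4,968/year, and 85% quoted in this post.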

🎯 Step 1: The Discovery Phase (Week 1)

We started by enabling CloudWatch detailed monitoring and actually looking at the data. The results were shocking:

What We Found:

  • 📊 Average CPU usage: 8% (paying for 92% idle time)
  • 💾 Memory usage: 2.1 GB of 8 GB available
  • 🌐 Traffic pattern: 80% of requests between 9 AM - 5 PM
  • 💤 Night usage: < 100 requests/hour (still paying full price)
  • 🗄️ Database: 43 GB used of 100 GB allocated
  • Response time: 847ms average (mostly database queries)

🚨 Mistake #1: Not Checking Actual Usage First

The client had been running oversized EC2 instances for 2 years based on "expected growth" that never materialized. They were paying $350/month for capacity they never used.

📐 Step 2: Architecture Redesign (Week 2)

Day 1-2: Mapped All Dependencies

Created a complete diagram of every service, API endpoint, database table, and external integration. Found 23 endpoints, but only 5 were used daily.

Day 3-4: Designed Serverless Architecture

Planned migration to Lambda + API Gateway + DynamoDB. Identified which endpoints could be migrated first (stateless ones).

Day 5-7: Proof of Concept

Migrated the healthcheck endpoint to Lambda. It worked perfectly and cost $0.0000002 per request (Lambda's request charge of $0.20 per million, before duration) vs $0.00014 on EC2.
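
A per-request comparison like this is simple division. Here's a rough sketch, assuming a hypothetical volume of one million requests per month (a round number for illustration, not a measured figure from this project). The Lambda side covers the request charge only.

```javascript
// Spread a fixed monthly bill over every request served that month.
function costPerRequest(monthlyCostUsd, monthlyRequests) {
  if (monthlyRequests <= 0) {
    throw new RangeError('monthlyRequests must be positive');
  }
  return monthlyCostUsd / monthlyRequests;
}

const ec2MonthlyUsd = 146.40;   // 2x m5.large, from the cost breakdown
const assumedRequests = 1e6;    // hypothetical volume, for illustration only

const ec2PerRequest = costPerRequest(ec2MonthlyUsd, assumedRequests);
const lambdaRequestCharge = 0.20 / 1e6; // request charge only, duration excluded

console.log('EC2 per request:   $' + ec2PerRequest.toFixed(6));
console.log('Lambda per request: $' + lambdaRequestCharge.toFixed(7));
```

The point of the exercise: an always-on server costs the same whether it serves one request or a million, so low traffic makes every request expensive.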

💻 Step 3: The Actual Migration (Weeks 3-4)

Phase 1: API Migration to Lambda

❌ Before: Express.js on EC2

// app.js - Running 24/7 on EC2
const express = require('express');
const mysql = require('mysql2');
const app = express();

const db = mysql.createConnection({
  host: 'rds-instance.aws.com',
  user: 'admin',
  password: process.env.DB_PASS,
  database: 'healthapp'
});

app.get('/api/patients/:id', (req, res) => {
  db.query(
    'SELECT * FROM patients WHERE id = ?',
    [req.params.id],
    (err, results) => {
      if (err) return res.status(500).json({err});
      res.json(results[0]);
    }
  );
});

app.listen(3000); // Running forever

✅ After: Lambda Function

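// getPatient.js - Only runs when needed
const AWS = require('aws-sdk');
// Created once per container so warm invocations reuse the client
const dynamodb = new AWS.DynamoDB.DocumentClient();

// Sent on every response, including errors, so browsers can read them
const headers = { 'Access-Control-Allow-Origin': '*' };

exports.handler = async (event) => {
  const { id } = event.pathParameters;

  try {
    const result = await dynamodb.get({
      TableName: 'patients',
      Key: { patientId: id }
    }).promise();

    // get() resolves without an Item when the key does not exist
    if (!result.Item) {
      return {
        statusCode: 404,
        headers,
        body: JSON.stringify({ error: 'Patient not found' })
      };
    }

    return {
      statusCode: 200,
      headers,
      body: JSON.stringify(result.Item)
    };
  } catch (error) {
    return {
      statusCode: 500,
      headers,
      body: JSON.stringify({ error: error.message })
    };
  }
};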

Phase 2: Database Migration (The Tricky Part)

Moving from RDS MySQL to DynamoDB was the scariest part. Here's exactly how we did it:

Step 1: Analyzed Query Patterns

-- Found that 94% of queries were simple key-value lookups
SELECT * FROM patients WHERE patient_id = ?    -- 67%
SELECT * FROM appointments WHERE date = ?       -- 18%
SELECT * FROM medications WHERE patient_id = ? -- 9%
Complex JOINs                                  -- 6%

Step 2: Designed DynamoDB Schema

{
  "TableName": "healthcare-app",
  "PartitionKey": "PK",      // PATIENT#12345
  "SortKey": "SK",           // PROFILE or APPT#2024-01-15
  "GSI1": {
    "PartitionKey": "GSI1PK", // DATE#2024-01-15
    "SortKey": "GSI1SK"        // APPT#12345
  }
}
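
To make the key layout concrete, here's a minimal sketch of how items and queries could be built for this schema. Only the key formats (PATIENT#, PROFILE, APPT#, DATE#) and the table name come from the schema above; the function names and the index name GSI1 are our assumptions for illustration.

```javascript
// Key builders for the single-table layout. Function names are illustrative.
const patientPk = (patientId) => `PATIENT#${patientId}`;

function profileItem(patientId, attributes) {
  return { PK: patientPk(patientId), SK: 'PROFILE', ...attributes };
}

function appointmentItem(patientId, isoDate, attributes) {
  return {
    PK: patientPk(patientId),
    SK: `APPT#${isoDate}`,
    GSI1PK: `DATE#${isoDate}`, // lets the index answer "all appointments on a date"
    GSI1SK: `APPT#${patientId}`,
    ...attributes,
  };
}

// DocumentClient-style query parameters: one patient's appointments.
function appointmentsForPatientQuery(patientId) {
  return {
    TableName: 'healthcare-app',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :prefix)',
    ExpressionAttributeValues: { ':pk': patientPk(patientId), ':prefix': 'APPT#' },
  };
}

// Same idea against the index (index name assumed to be GSI1).
function appointmentsOnDateQuery(isoDate) {
  return {
    TableName: 'healthcare-app',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :pk',
    ExpressionAttributeValues: { ':pk': `DATE#${isoDate}` },
  };
}
```

Both of the common lookups from the query analysis (by patient and by date) map to a single Query call with this layout, with no joins needed.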

Step 3: Dual-Write Strategy

For 2 weeks, we wrote to both MySQL and DynamoDB, but only read from MySQL. This let us verify data integrity without putting production reads at risk.
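
Here's a simplified sketch of the dual-write idea, not our production code. The two writer functions are injected, so the sketch runs without a database. The `id` field and all function names are assumptions made for illustration.

```javascript
// MySQL stays the source of truth. A failed shadow write to DynamoDB is
// reported but never fails the request.
async function dualWrite(record, writeMysql, writeDynamo, onShadowError = console.error) {
  const result = await writeMysql(record);
  try {
    await writeDynamo(record);
  } catch (err) {
    onShadowError(err);
  }
  return result;
}

// Integrity check: compare both stores by id and list every disagreement.
function findMismatches(mysqlRows, dynamoItems, fields) {
  const dynamoById = new Map(dynamoItems.map((item) => [String(item.id), item]));
  const mismatches = [];

  for (const row of mysqlRows) {
    const item = dynamoById.get(String(row.id));
    if (!item) {
      mismatches.push({ id: row.id, reason: 'missing in DynamoDB' });
      continue;
    }
    const differing = fields.filter((f) => String(row[f]) !== String(item[f]));
    if (differing.length > 0) {
      mismatches.push({ id: row.id, reason: 'field mismatch', fields: differing });
    }
  }
  return mismatches;
}
```

Running a comparison like `findMismatches` on a schedule during the dual-write window gives you a concrete signal (an empty list) before any read is switched over.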

Step 4: Gradual Cutover

Switched reads to DynamoDB one endpoint at a time. Started with low-traffic endpoints, monitored for 24 hours, then proceeded.
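
One way to sketch this kind of endpoint-by-endpoint cutover is a per-endpoint read switch plus an ordering by traffic. The endpoint paths and traffic numbers you pass in are up to you; everything named below is hypothetical rather than taken from our codebase.

```javascript
// Endpoints whose reads have been switched to DynamoDB.
const dynamoReadEndpoints = new Set();

function enableDynamoReads(endpoint) {
  dynamoReadEndpoints.add(endpoint);
}

// Rollback is a single switch flip, because MySQL is still being written to.
function rollbackToMysql(endpoint) {
  dynamoReadEndpoints.delete(endpoint);
}

function readSourceFor(endpoint) {
  return dynamoReadEndpoints.has(endpoint) ? 'dynamodb' : 'mysql';
}

// Lowest-traffic endpoints go first so early problems affect the fewest users.
function cutoverOrder(endpoints) {
  return [...endpoints]
    .sort((a, b) => a.requestsPerDay - b.requestsPerDay)
    .map((e) => e.path);
}
```

In a real deployment the switch would live in configuration (an environment variable or a parameter store entry) rather than in memory, so it survives cold starts.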

💡 Key Learning: Single-Table Design

Instead of 12 MySQL tables, we used ONE DynamoDB table with composite keys. This reduced costs by 60% and actually improved query performance.

🚀 Step 4: Performance Optimization (Week 5)

The Unexpected Performance Gains

Before Performance

  • API Response: 847ms average
  • Cold starts: N/A (always running)
  • Scaling: Manual (scary)
  • Availability: 99.5% (2 outages/year)

After Performance

  • API Response: 124ms average
  • Cold starts: 1.2s (< 1% of requests)
  • Scaling: Automatic (up to account concurrency limits)
  • Availability: 99.99% (AWS managed)

Optimization Techniques We Used:

  1. Lambda Memory Tuning: Started at 128MB, tested up to 3GB. Found sweet spot at 512MB (best cost/performance).
  2. Connection Reuse: Created the DynamoDB client outside the handler so warm invocations reuse it and its HTTP connections.
  3. CloudFront Caching: Cached API responses for 60 seconds (reduced Lambda invocations by 40%).
  4. Provisioned Concurrency: Set 2 warm instances for morning traffic spike.
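
Memory tuning (technique 1) comes down to comparing GB-seconds. Here's a minimal sketch, assuming the published x86 price of $0.0000166667 per GB-second in us-east-1 (check current pricing for your region). The duration figures are hypothetical, not our measurements.

```javascript
// Assumed x86 Lambda price per GB-second (us-east-1); verify before relying on it.
const PRICE_PER_GB_SECOND = 0.0000166667;

// Duration cost only; the per-request charge is the same at every memory size.
function costPerMillionInvocations(memoryMb, avgDurationMs, pricePerGbSecond = PRICE_PER_GB_SECOND) {
  const gbSeconds = (memoryMb / 1024) * (avgDurationMs / 1000);
  return gbSeconds * pricePerGbSecond * 1e6;
}

// Hypothetical measurements: more memory also means more CPU, so duration drops.
const measurements = [
  { memoryMb: 128, avgDurationMs: 900 },
  { memoryMb: 512, avgDurationMs: 180 },
  { memoryMb: 1024, avgDurationMs: 120 },
  { memoryMb: 3072, avgDurationMs: 110 },
];

const ranked = measurements
  .map((m) => ({ ...m, usdPerMillion: costPerMillionInvocations(m.memoryMb, m.avgDurationMs) }))
  .sort((a, b) => a.usdPerMillion - b.usdPerMillion);

console.log(ranked);
```

With these made-up durations the cheapest setting is 512MB, not 128MB: the smallest size runs so much longer that it ends up costing more. That's why it's worth measuring instead of assuming.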

💰 Step 5: Cost Optimization Tricks (Week 6)

Advanced Cost Savings We Implemented:

1. S3 Intelligent-Tiering

aws s3api put-bucket-lifecycle-configuration \
  --bucket healthcare-uploads \
  --lifecycle-configuration file://lifecycle.json

# Saved $18/month on old patient files
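
The command above points at a lifecycle.json file. As a rough sketch, a file of that shape could be generated like this. The rule ID, the `uploads/` prefix, and the 30-day threshold are placeholders, not the values we used.

```javascript
// Builds a configuration in the shape accepted by
// `aws s3api put-bucket-lifecycle-configuration`.
function buildLifecycleConfig(prefix, daysBeforeTransition) {
  return {
    Rules: [
      {
        ID: 'move-old-objects-to-intelligent-tiering', // placeholder rule name
        Status: 'Enabled',
        Filter: { Prefix: prefix },
        Transitions: [
          { Days: daysBeforeTransition, StorageClass: 'INTELLIGENT_TIERING' },
        ],
      },
    ],
  };
}

// Placeholder prefix and threshold; adjust to your bucket layout.
const lifecycleJson = JSON.stringify(buildLifecycleConfig('uploads/', 30), null, 2);
console.log(lifecycleJson); // save this output as lifecycle.json
```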

2. DynamoDB On-Demand vs Provisioned

Started with on-demand ($25/month). Would need 50% more traffic to justify provisioned capacity.
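
The break-even logic can be sketched under two simplifying assumptions: on-demand spend scales linearly with traffic, and a provisioned table sized for peak costs a flat monthly amount. The $37.50 provisioned figure below is a hypothetical quote chosen for illustration, not a number from our bill.

```javascript
// How many times today's traffic is needed before provisioned becomes cheaper.
function breakEvenTrafficMultiplier(onDemandMonthlyUsd, provisionedMonthlyUsd) {
  if (onDemandMonthlyUsd <= 0) {
    throw new RangeError('onDemandMonthlyUsd must be positive');
  }
  return provisionedMonthlyUsd / onDemandMonthlyUsd;
}

const onDemandUsd = 25.00;    // current on-demand spend
const provisionedUsd = 37.50; // hypothetical provisioned cost, for illustration only

const multiplier = breakEvenTrafficMultiplier(onDemandUsd, provisionedUsd);
const extraTrafficPct = Math.round((multiplier - 1) * 100);

console.log(`Provisioned breaks even at ${multiplier}x traffic (+${extraTrafficPct}%)`);
```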

3. CloudWatch Logs Retention

aws logs put-retention-policy \
  --log-group-name /aws/lambda/patient-api \
  --retention-in-days 7

# Saved $12/month on log storage

4. Lambda ARM Architecture

Switched to Graviton2 (arm64) - 20% cheaper, 19% faster!

🚨 The Mistakes We Made (So You Don't Have To)

Mistake #2: Forgetting About VPC

Initially put Lambda in VPC for "security". This added NAT Gateway ($45/month) and cold starts (3+ seconds). Removed VPC, used IAM roles instead. Saved $45/month and 2.8 seconds per cold start.

Mistake #3: Over-Engineering Logging

Set up elaborate X-Ray tracing and custom metrics. Cost: $38/month. Actual value: minimal. Removed it, relied on basic CloudWatch Logs.

Mistake #4: Not Setting DLQs

A Lambda function had unbounded retries on one endpoint, so one bad request triggered 10,000 invocations. Cost: $2. It could have been $200 if we hadn't caught it quickly.
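
The general pattern is "retry a bounded number of times, then park the message". Here's an in-process sketch of that idea; it is an illustration, not Lambda's own mechanism. On Lambda itself, the equivalent is a setting on the function: a dead-letter queue or on-failure destination, plus a maximum retry count for asynchronous invocations.

```javascript
// Retry a handler up to maxAttempts times, then move the message to a
// dead-letter list instead of retrying forever.
function processWithRetryLimit(message, handler, maxAttempts, deadLetterQueue) {
  if (!Number.isInteger(maxAttempts) || maxAttempts < 1) {
    throw new RangeError('maxAttempts must be an integer >= 1');
  }

  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return { ok: true, attempts: attempt, result: handler(message) };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }

  deadLetterQueue.push({ message, error: lastError.message, attempts: maxAttempts });
  return { ok: false, attempts: maxAttempts };
}
```

The parked messages can then be inspected and replayed by hand, which turns a runaway bill into a short queue to review.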

📈 The Results: 6 Months Later

Final Numbers

  • $414 monthly savings
  • 85% cost reduction
  • 7x faster API
  • Automatic scalability

Unexpected Benefits:

  • Developer productivity: Deployments went from 30 minutes to 30 seconds
  • No more maintenance windows: Zero-downtime deployments
  • Better monitoring: Lambda's built-in metrics are superior
  • Compliance: Easier HIPAA compliance with serverless
  • Team morale: Developers love not managing servers

📋 Your 5-Week Migration Checklist

Week 1: Analysis

  • ☐ Enable CloudWatch detailed monitoring
  • ☐ Document actual CPU/memory usage
  • ☐ Map all endpoints and dependencies
  • ☐ Calculate per-request costs

Week 2: Planning

  • ☐ Design serverless architecture
  • ☐ Choose migration order (easiest first)
  • ☐ Set up development environment
  • ☐ Create rollback plan

Weeks 3-4: Migration

  • ☐ Migrate stateless endpoints first
  • ☐ Implement dual-write for databases
  • ☐ Run parallel systems for validation
  • ☐ Monitor costs daily

Week 5: Optimization

  • ☐ Tune Lambda memory settings
  • ☐ Implement caching strategies
  • ☐ Remove unnecessary services
  • ☐ Celebrate massive savings! 🎉

🔑 Key Takeaways

The 80/20 Rule of AWS Savings

  • 80% of savings came from eliminating idle EC2 capacity
  • 15% of savings from right-sizing resources
  • 5% of savings from advanced optimizations

Focus on the big wins first. You can optimize forever, but the first 20% of effort yields 80% of savings.

🚀 Ready to Cut Your AWS Costs?

Get Your Free AWS Cost Audit

We'll analyze your AWS infrastructure and show you exactly where you're overspending. Average client savings: $4,200/month.

  • ✅ Complete cost breakdown analysis
  • ✅ Serverless migration feasibility assessment
  • ✅ Custom optimization roadmap
  • ✅ ROI calculations and timeline

❓ FAQ

Q: How long did the migration really take?

6 weeks total: 2 weeks of analysis and planning, 2 weeks of migration, and 2 weeks of optimization. Could have been 4 weeks if we hadn't made the VPC mistake.

Q: What about vendor lock-in?

Yes, we're more locked into AWS now. But we're saving $4,968/year. That buys a lot of migration budget if needed.

Q: Did anything break during migration?

One pagination endpoint had a bug for 2 hours. That's it. The dual-write strategy prevented data issues.

Q: Is serverless always cheaper?

No. If CPU usage is consistently > 60%, EC2 might be cheaper. But that's rare - we've seen it in only 15% of applications.

Q: What about cold starts?

They affect < 1% of requests and add 1.2 seconds. For an 85% cost saving, the client gladly accepted this tradeoff.