#11. SQL vs NoSQL Decision Guide - Flexibility vs Scale
The One Thing to Remember
SQL = relationships and consistency. NoSQL = flexibility and scale. Choose SQL when you need complex queries and ACID. Choose NoSQL when you need horizontal scaling, flexible schemas, or specific data models (documents, graphs, time-series).
Building on Article 10
In Article 10: Isolation Levels & Anomalies, you learned how SQL databases provide different isolation levels. But here's the question: When should you use SQL vs NoSQL, and what are the real trade-offs?
Choosing the wrong database is expensive—migrations take months, performance problems are hard to fix, and team expertise is wasted.
← Previous: Article 10 - Isolation Levels & Anomalies
Why This Matters (A Migration Horror Story)
I once watched a team migrate from MongoDB to PostgreSQL after 6 months of data consistency issues. They used MongoDB for financial transactions, then struggled with eventual consistency. The migration took 3 months and cost hundreds of thousands in engineering time. The lesson? Choose the right database from the start.
This isn't academic knowledge—it's the difference between:
-
Building on the right foundation vs rebuilding later
- Understanding trade-offs = you choose correctly
- Not understanding = you migrate later, waste months
-
Scaling successfully vs hitting walls
- Understanding scaling models = you plan for growth
- Not understanding = you hit limits, panic, rebuild
-
Team productivity vs constant firefighting
- Understanding database strengths = you use the right tool
- Not understanding = you fight the database constantly
The Categories
┌─────────────────────────────────────────────────────────────────┐
│ DATABASE LANDSCAPE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ RELATIONAL (SQL) │
│ ═════════════════ │
│ PostgreSQL, MySQL, SQLite, SQL Server, Oracle │
│ • Tables with rows and columns │
│ • Fixed schema │
│ • ACID transactions │
│ • Complex joins and queries │
│ │
│ DOCUMENT │
│ ══════════ │
│ MongoDB, CouchDB, Amazon DocumentDB │
│ • JSON-like documents │
│ • Flexible schema │
│ • Good for nested data │
│ • Limited joins │
│ │
│ KEY-VALUE │
│ ═══════════ │
│ Redis, Memcached, Amazon DynamoDB, etcd │
│ • Simple get/set by key │
│ • Extremely fast │
│ • Limited query capability │
│ • Good for caching, sessions │
│ │
│ WIDE-COLUMN │
│ ════════════ │
│ Cassandra, HBase, ScyllaDB │
│ • Rows with dynamic columns │
│ • Optimized for writes │
│ • Horizontal scaling │
│ • Good for time-series, IoT │
│ │
│ GRAPH │
│ ═══════ │
│ Neo4j, Amazon Neptune, ArangoDB │
│ • Nodes and relationships │
│ • Optimized for traversals │
│ • Good for social networks, recommendations │
│ │
│ TIME-SERIES │
│ ════════════ │
│ InfluxDB, TimescaleDB, Prometheus │
│ • Optimized for time-stamped data │
│ • Compression for time data │
│ • Good for metrics, IoT, financial data │
│ │
└─────────────────────────────────────────────────────────────────┘
Common Mistakes (I've Made These)
Mistake #1: "NoSQL because it scales"
Why it's wrong: PostgreSQL handles millions of rows easily. Most apps never need NoSQL scale. SQL + read replicas handles most load.
Right approach: Start with SQL, migrate if needed. Don't prematurely optimize with NoSQL.
Mistake #2: "MongoDB for everything"
Why it's wrong: MongoDB's flexible schema becomes a mess. Lack of transactions (historically) caused real data loss. Joins across collections are painful.
Right approach: Use MongoDB for actual document data (nested, varying structure). Don't use MongoDB as a "schemaless SQL."
Mistake #3: "One database for all"
Why it's wrong: Different data has different needs. Logging doesn't need ACID. User auth doesn't need Cassandra's scale.
Right approach: Use the right tool for each job. Most production systems use multiple databases.
SQL vs NoSQL: Core Trade-offs
┌─────────────────────────────────────────────────────────────────┐
│ │
│ SQL (Relational) NoSQL (Various) │
│ ═════════════════ ═══════════════ │
│ │
│ Schema: Fixed (strict) Flexible (schemaless) │
│ │
│ Scaling: Vertical (bigger box) Horizontal (more boxes) │
│ + Read replicas Native sharding │
│ │
│ ACID: Full support Varies (often eventual) │
│ │
│ Queries: Complex joins, Limited joins, │
│ aggregations document/key lookups │
│ │
│ Best for: Complex relationships High volume, simple access │
│ Transactions Flexible data models │
│ Unknown query patterns Known access patterns │
│ │
└─────────────────────────────────────────────────────────────────┘
When to Choose SQL
Strong Use Cases
✅ USE SQL WHEN:
1. Complex relationships between entities
• E-commerce: Users → Orders → Products → Categories
• ERP systems with hundreds of related tables
2. ACID transactions are critical
• Financial systems
• Inventory management
• Any "money" operations
3. Unknown query patterns
• Business intelligence
• Ad-hoc reporting
• Data exploration
4. Data integrity is paramount
• Healthcare records
• Audit trails
• Compliance requirements
5. Your data is naturally tabular
• Spreadsheet-like data
• Consistent structure across records
SQL Example: E-commerce
-- Natural fit: Related entities with constraints
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
status VARCHAR(50),
total DECIMAL(10,2),
created_at TIMESTAMP DEFAULT NOW()
);
-- Complex query: Top customers by category
SELECT u.email, c.name as category, SUM(oi.price * oi.quantity) as total
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
JOIN categories c ON p.category_id = c.id
GROUP BY u.email, c.name
ORDER BY total DESC;
When to Choose NoSQL
Document Database (MongoDB)
✅ USE DOCUMENT DB WHEN:
1. Data is naturally hierarchical/nested
• Blog posts with comments and tags
• Product catalogs with varying attributes
2. Schema varies per record
• Different product types have different fields
• User-generated content
3. Rapid iteration on schema
• Startup MVPs
• Prototyping
4. Self-contained documents
• Each document has all needed data
• Minimal need for joins
Key-Value Store (Redis)
✅ USE KEY-VALUE WHEN:
1. Simple access patterns
• Get user session by ID
• Cache computed results
2. Extreme performance needed
• Sub-millisecond latency
• High throughput
3. Ephemeral data
• Sessions, caches
• Rate limiting counters
Wide-Column (Cassandra)
✅ USE WIDE-COLUMN WHEN:
1. Write-heavy workloads
• IoT sensor data
• User activity logs
2. Time-series data at scale
• Metrics, events
• Financial ticks
3. Horizontal scaling required
• Petabytes of data
• Global distribution
The Hybrid Approach (Most Real Systems)
┌─────────────────────────────────────────────────────────────────┐
│ TYPICAL PRODUCTION ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ Application │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────────┬────────────┼────────────┬─────────────┐ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐ │
│ │ Redis │ │PostgreSQL│ │ MongoDB │ │Cassandra │ │ S3 │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │Sessions│ │ Users │ │ Product │ │ Events │ │ Files │ │
│ │ Cache │ │ Orders │ │ Catalog │ │ Metrics │ │ Media │ │
│ │Rate Lim│ │ Payments │ │ Reviews │ │ Logs │ │ Backups │ │
│ └────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────┘ │
│ │
│ Each database does what it's best at! │
└─────────────────────────────────────────────────────────────────┘
Real example: Netflix
- Cassandra: Viewing history (write-heavy, time-series)
- MySQL: Billing and accounts (ACID transactions)
- ElasticSearch: Search and discovery
- Redis: Session cache, real-time data
- S3: Video file storage
Lesson: Most production systems use multiple databases. Use the right tool for each job.
Decision Framework
START HERE
│
▼
┌───────────────────────────────────────────┐
│ Do you need complex joins and │
│ transactions across multiple entities? │
└───────────────────┬───────────────────────┘
│
┌───────────┴───────────┐
▼ ▼
YES NO
│ │
▼ ▼
┌─────────┐ ┌───────────────────────────────┐
│ SQL │ │ Is your data naturally │
│PostgreSQL│ │ nested/hierarchical? │
│ MySQL │ └───────────────┬───────────────┘
└─────────┘ │
┌───────────┴───────────┐
▼ ▼
YES NO
│ │
▼ ▼
┌───────────┐ ┌───────────────────────────┐
│ Document │ │ Is it simple key-value │
│ MongoDB │ │ access patterns? │
└───────────┘ └───────────────┬───────────┘
│
┌───────────┴───────────┐
▼ ▼
YES NO
│ │
▼ ▼
┌───────────┐ ┌──────────────────────┐
│ Key-Value │ │ Is it time-series │
│ Redis │ │ or write-heavy? │
│ DynamoDB │ └──────────┬───────────┘
└───────────┘ │
┌────────────┴────────────┐
▼ ▼
YES NO
│ │
▼ ▼
┌───────────┐ ┌──────────────────┐
│Wide-Column│ │ Relationships │
│ Cassandra │ │ are the focus? │
│InfluxDB │ │ │
└───────────┘ │ YES → Graph DB │
│ NO → SQL/Document│
└──────────────────┘
Quick Reference
| Need | Best Choice | Why |
|---|---|---|
| E-commerce orders | PostgreSQL | ACID, relationships |
| Product catalog | MongoDB | Varying attributes |
| User sessions | Redis | Fast, ephemeral |
| Event logs | Cassandra | Write-heavy, scale |
| Social graph | Neo4j | Relationship traversals |
| Metrics | InfluxDB | Time-series optimized |
| Search | Elasticsearch | Full-text, facets |
| Files | S3 | Blob storage |
Key Takeaways
- SQL is the safe default - handles most use cases well
- NoSQL for specific needs - not just "because scale"
- Hybrid is normal - use multiple databases for different data (Netflix uses 5+)
- Consider operations - can your team run Cassandra? (it's complex)
- Plan for migration - wrong choice isn't forever, but it's expensive (months of work)
- Start simple - PostgreSQL handles most apps. Migrate when you hit real limits.
What's Next
Now that you understand when to choose SQL vs NoSQL, the next question is: How do you scale beyond a single machine?
In the next article, CAP Theorem Demystified - Consistency vs Availability, you'll learn:
- What CAP theorem actually means (most people get it wrong)
- Why partition tolerance isn't optional
- How to choose between consistency and availability
- Real-world examples of CP vs AP systems
This builds on what you learned here—NoSQL databases often make different CAP trade-offs than SQL databases.
→ Continue to Article 12: CAP Theorem
This article is part of the Backend Engineering Mastery series. Choosing the right database is one of the most important architectural decisions.