Choosing the right database is one of the most critical decisions in system design. Your choice affects:
Performance under load
Scalability as data grows
Complexity of handling real-world scenarios
To help you prepare, here’s a breakdown of the 10 essential database types you need to know. For each, we’ll cover:
✅ What it is
✅ When to use it (with real-world examples)
✅ Key design considerations
✅ Popular databases to reference in interviews
1. Relational Databases
Stores data in structured tables (rows and columns) with defined relationships. Uses SQL for querying.
When to Use
✔ Structured, relational data (e.g., e-commerce with Users, Orders, Products tables)
✔ Strong consistency & ACID compliance (e.g., banking transactions)
✔ Complex queries & reporting (joins, aggregations)
Design Considerations
🔹 Indexing – Speed up reads but can slow writes (index user_id, email).
🔹 Normalization vs. Denormalization – Normalize for consistency; denormalize for read-heavy workloads.
🔹 Sharding – Split data horizontally (use high-cardinality keys like user_id).
🔹 Scaling – Vertical (add CPU/RAM) or horizontal (read replicas, caching).
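To make the indexing and join points concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a production relational database. The table names, columns, and the order_summary helper are illustrative assumptions, not part of any particular system.

```python
import sqlite3

# In-memory SQLite database, used here only as a small relational stand-in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total_cents INTEGER NOT NULL
    );
    -- Secondary index: faster lookups by user_id, extra work on every write.
    CREATE INDEX idx_orders_user_id ON orders(user_id);
""")

# One transaction: all inserts commit together, or none do if an error occurs.
with conn:
    conn.execute("INSERT INTO users (id, email) VALUES (1, 'ada@example.com')")
    conn.executemany(
        "INSERT INTO orders (user_id, total_cents) VALUES (?, ?)",
        [(1, 2500), (1, 1200)],
    )


def order_summary(conn, user_id):
    """Join users to orders and aggregate the order count and total per user."""
    return conn.execute(
        """
        SELECT u.email, COUNT(o.id), COALESCE(SUM(o.total_cents), 0)
        FROM users u
        LEFT JOIN orders o ON o.user_id = u.id
        WHERE u.id = ?
        GROUP BY u.id
        """,
        (user_id,),
    ).fetchone()


summary = order_summary(conn, 1)  # ('ada@example.com', 2, 3700)
```

The same schema and query carry over to PostgreSQL or MySQL with minor syntax differences; what matters in an interview is naming which columns you would index and why.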
Example Databases
PostgreSQL (open-source, feature-rich)
MySQL (LAMP stack staple)
Oracle DB (enterprise-grade)
2. In-Memory Databases
Stores data in RAM instead of disk—ideal for ultra-low latency.
When to Use
✔ Real-time applications (e.g., gaming leaderboards)
✔ Caching layer (e.g., Redis for session storage)
✔ Temporary data (e.g., rate-limiting counters)
Design Considerations
🔹 Volatility – Data is lost on a crash unless persisted (Redis offers RDB snapshots and an append-only file, AOF).
🔹 Eviction Policies – LRU, LFU, or TTL to manage limited RAM.
🔹 Replication – Async replication for failover (but risk of data loss).
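The eviction policies above can be sketched in a few lines. The following is a toy LRU cache with a per-entry TTL, not how Redis or Memcached implement eviction internally; the class name and the injectable clock parameter are assumptions made so the example is easy to test.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Toy cache: evicts the least recently used key and expires entries by TTL."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so tests can control time
        self._data = OrderedDict()    # key -> (value, expires_at), oldest first

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._data[key]       # lazy expiry on read
            return None
        self._data.move_to_end(key)   # mark as most recently used
        return value

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
```

Expiry here is lazy (checked on read), which keeps the sketch short; a real cache would typically also sweep expired keys in the background.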
Example Databases
Redis (supports rich data structures)
Memcached (simple key-value caching)
3. Key-Value Stores
Simple key → value pairs (like a distributed hashmap).
When to Use
✔ Fast lookups by key (e.g., URL shorteners, session stores)
✔ High-throughput workloads (millions of ops/sec)
✔ No complex queries needed
Design Considerations
🔹 No joins – Access is by key. The pure model has no secondary indexes, though some stores (e.g., DynamoDB) add them.
🔹 Schema-less – Values can be JSON, strings, or binary blobs.
🔹 Easy horizontal scaling – Consistent hashing for distribution.
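Consistent hashing is worth being able to sketch on a whiteboard. The version below is a minimal illustration, assuming SHA-256 as the hash and 100 virtual nodes per server; the HashRing class and its method names are hypothetical and not taken from any specific database.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (ignores hash collisions)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._hashes = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            self._owners[h] = node
            bisect.insort(self._hashes, h)

    def remove_node(self, node):
        self._hashes = [h for h in self._hashes if self._owners[h] != node]
        self._owners = {h: n for h, n in self._owners.items() if n != node}

    def get_node(self, key):
        if not self._hashes:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._hashes)
        return self._owners[self._hashes[idx]]
```

The property to call out: when a node joins, the only keys that move are the ones the new node now owns, so most of the data stays where it was. With plain `hash(key) % N`, changing N would remap almost every key.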
Example Databases
Redis (also supports advanced structures)
DynamoDB (managed, scalable)
4. Document Databases
Stores flexible JSON-like documents (schema-less).
When to Use
✔ Variable data structures (e.g., CMS with different content types)
✔ Nested/hierarchical data (e.g., user profiles with embedded addresses)
✔ Rapid iteration (no schema migrations)
Design Considerations
🔹 Indexing – Critical for performance (index user_id, email).
🔹 Document size limits – MongoDB caps each document at 16 MB; split large documents if needed.
🔹 Denormalization – Embed related data to avoid joins.
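As a rough illustration of embedding and of splitting a document that grows too large, the sketch below uses plain Python dicts in place of a real document database. The field names and the split_overflow helper are assumptions, and the JSON byte count is only an approximation of the BSON size that MongoDB actually measures.

```python
import json

# MongoDB's limit applies to BSON size; JSON length is only a rough proxy.
MAX_DOC_BYTES = 16 * 1024 * 1024

# Denormalized document: addresses are embedded, so one read returns everything.
user_doc = {
    "_id": "user_42",
    "email": "ada@example.com",
    "addresses": [
        {"label": "home", "city": "London", "postcode": "NW1"},
        {"label": "work", "city": "Cambridge", "postcode": "CB2"},
    ],
}


def approx_size_bytes(doc):
    """Approximate the stored size by serializing to UTF-8 JSON."""
    return len(json.dumps(doc).encode("utf-8"))


def split_overflow(doc, field, max_items):
    """Keep the first max_items embedded entries in the parent document and
    return the rest as separate documents that reference the parent by id."""
    kept = dict(doc)
    items = doc[field]
    kept[field] = items[:max_items]
    overflow = [{"parent_id": doc["_id"], **item} for item in items[max_items:]]
    return kept, overflow


kept_doc, overflow_docs = split_overflow(user_doc, "addresses", max_items=1)
```

Embedding works well when the nested list is small and bounded; an unbounded list (comments, events) is the usual reason to move entries into their own collection.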
Example Databases
MongoDB (most popular)
Firestore (realtime updates for apps)
5. Graph Databases
Optimized for relationships (nodes + edges).
When to Use
✔ Social networks (friend-of-friend queries)
✔ Recommendation engines ("users who bought X also bought Y")
✔ Fraud detection (pattern analysis)
Design Considerations
🔹 Traversal efficiency – Handles deep relationships better than SQL joins.
🔹 Query languages – Cypher (Neo4j) or Gremlin.
🔹 Scalability – Some (Neo4j Enterprise) support distributed graphs.
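A friend-of-friend query is a bounded breadth-first traversal. The sketch below runs it over an in-memory adjacency map to show the shape of the work a graph database optimizes; the sample data and function names are made up for illustration.

```python
from collections import deque

# Undirected friendship graph as an adjacency map (sample data).
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: return {person: hop distance} within max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if dist[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[person] + 1
                queue.append(neighbor)
    del dist[start]
    return dist


def friends_of_friends(graph, person):
    """People exactly two hops away: not the person, not a direct friend."""
    return {p for p, d in within_hops(graph, person, 2).items() if d == 2}


suggestions = friends_of_friends(friends, "alice")  # {'dave', 'erin'}
```

In SQL, each extra hop is another self-join on a friendships table, which is why deep traversals are where graph databases tend to pull ahead.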
Example Databases
Neo4j (industry leader)
Amazon Neptune (managed service)
6. Wide-Column Stores
Like spreadsheets on steroids—each row can have different columns.
When to Use
✔ Massive write scalability (e.g., IoT sensor data)
✔ Time-series or sparse data (e.g., user activity logs)
Design Considerations
🔹 Schema design – Partition keys impact performance (avoid hotspots).
🔹 Denormalization – Joins are unsupported (e.g., Cassandra) or expensive; duplicate data into one table per query pattern instead.
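One common way to avoid hot or unbounded partitions is to add a time bucket to the partition key. The sketch below models that idea in memory, assuming a (sensor_id, day) partition key with rows clustered by timestamp; SensorTable is a hypothetical stand-in, not a Cassandra client.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(sensor_id, ts):
    """Bucket by sensor and calendar day so no single partition grows forever."""
    return (sensor_id, ts.strftime("%Y-%m-%d"))


class SensorTable:
    """Toy wide-column table: partition key -> rows kept sorted by timestamp."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, sensor_id, ts, reading):
        rows = self._partitions[partition_key(sensor_id, ts)]
        bisect.insort(rows, (ts, reading))  # timestamp acts as the clustering key

    def read_day(self, sensor_id, day):
        """Single-partition read: all rows for one sensor on one day, in time order."""
        return list(self._partitions.get((sensor_id, day), []))


table = SensorTable()
table.insert("s1", datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc), 20.5)
table.insert("s1", datetime(2024, 1, 2, 0, 1, tzinfo=timezone.utc), 20.7)
table.insert("s1", datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), 19.0)
```

The trade-off is that a query spanning several days must read several partitions, so the bucket size should match the most common query window.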
Example Databases
Cassandra (used by Netflix, Instagram)
ScyllaDB (high-performance alternative)
7. Time-Series Databases
Built for timestamped data (metrics, logs).
When to Use
✔ Monitoring/observability (e.g., Prometheus for Kubernetes)
✔ Financial tick data
✔ Rollup aggregations (e.g., daily averages)
Design Considerations
🔹 Time-based indexing – Queries are fast within time ranges.
🔹 Downsampling – Aggregate raw data to save space.
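Downsampling is simple to show directly. This sketch groups raw (timestamp, value) points into fixed-width buckets and keeps only the mean of each; the function name and the choice of mean as the aggregate are assumptions, and time-series databases typically offer this as a built-in feature.

```python
from collections import defaultdict


def downsample_mean(points, bucket_seconds):
    """Aggregate (unix_ts, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) sorted by bucket start.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bucket_start -> [running total, count]
    for ts, value in points:
        bucket = ts - ts % bucket_seconds  # floor the timestamp to its bucket
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return [(bucket, total / count) for bucket, (total, count) in sorted(sums.items())]


raw = [(0, 1.0), (30, 3.0), (60, 10.0), (119, 20.0), (120, 5.0)]
per_minute = downsample_mean(raw, 60)  # [(0, 2.0), (60, 15.0), (120, 5.0)]
```

Five raw points become three rolled-up points here; at production scale the same idea turns per-second metrics into per-minute or per-hour series for long-term retention.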
Example Databases
InfluxDB
TimescaleDB (PostgreSQL extension)
8. Text-Search Databases
Optimized for full-text search (inverted indexes, fuzzy matching).
When to Use
✔ E-commerce search (e.g., "running shoes" with filters)
✔ Log analysis (free-text log queries)
Design Considerations
🔹 Tokenization & stemming – "Running" → "run" for better matches.
🔹 Relevance scoring – TF-IDF or BM25 ranking.
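To show how an inverted index and BM25 fit together, here is a small self-contained sketch. It assumes lowercase word tokenization with no stemming, the commonly used defaults k1=1.5 and b=0.75, and a hypothetical SearchIndex class; production engines such as Elasticsearch add analyzers, sharding, and much more.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics (no stemming in this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Toy inverted index with BM25 ranking."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_len = {}                  # doc_id -> number of tokens

    def add(self, doc_id, text):
        tokens = tokenize(text)
        self.doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            self.postings[term][doc_id] = tf

    def search(self, query):
        """Return doc ids matching any query term, best BM25 score first."""
        n_docs = len(self.doc_len)
        if n_docs == 0:
            return []
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - len(docs) + 0.5) / (len(docs) + 0.5))
            for doc_id, tf in docs.items():
                length_norm = 1 - self.b + self.b * self.doc_len[doc_id] / avg_len
                scores[doc_id] += idf * tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
        return sorted(scores, key=scores.get, reverse=True)
```

For example, after indexing "Lightweight running shoes for road running", "Leather dress shoes", and "Trail running backpack", the query "running shoes" ranks the first document highest because it matches both terms.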
Example Databases
Elasticsearch (most popular)
Solr (enterprise search)
9. Spatial Databases
Handles geographic data (locations, shapes).
When to Use
✔ Ride-hailing apps (find nearby drivers)
✔ Geofencing (e.g., delivery zones)
Design Considerations
🔹 Spatial indexing – R-trees for efficient queries.
🔹 Approximations – Bounding boxes for performance.
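The bounding-box approximation can be sketched as a two-step filter: a cheap rectangle check first, then an exact great-circle distance on the survivors. This example assumes a spherical Earth and ignores the poles and the antimeridian; a real spatial database would use an index such as an R-tree to avoid scanning every row. All names here are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_km):
    """Lat/lon rectangle that contains the search circle (not valid near poles)."""
    angular = radius_km / EARTH_RADIUS_KM
    dlat = math.degrees(angular)
    dlon = math.degrees(math.asin(math.sin(angular) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def nearby(drivers, lat, lon, radius_km):
    """Return driver ids within radius_km of (lat, lon), closest first."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    hits = []
    for driver_id, (d_lat, d_lon) in drivers.items():
        if not (min_lat <= d_lat <= max_lat and min_lon <= d_lon <= max_lon):
            continue  # cheap rejection before the trigonometry
        dist = haversine_km(lat, lon, d_lat, d_lon)
        if dist <= radius_km:
            hits.append((dist, driver_id))
    return [driver_id for _, driver_id in sorted(hits)]
```

The rectangle test uses only comparisons, so it can be served by ordinary range indexes on latitude and longitude, leaving the exact distance calculation for a small candidate set.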
Example Databases
PostGIS (PostgreSQL extension)
MongoDB (geospatial queries)
10. Blob Stores
For large binary files (images, videos).
When to Use
✔ Media storage (e.g., YouTube videos)
✔ Backups & logs
Design Considerations
🔹 Metadata management – Store in a separate DB.
🔹 CDN integration – Speed up global delivery.
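Keeping metadata separate from the bytes might look like the sketch below. A Python dict stands in for the object store and an in-memory SQLite table holds the metadata; the content-hash key scheme and the upload and find_by_type helpers are assumptions for illustration, not the API of S3 or any other service.

```python
import hashlib
import sqlite3

# Stand-in for an object store bucket: object key -> raw bytes.
blob_store = {}

# Metadata lives in a separate, queryable database.
meta = sqlite3.connect(":memory:")
meta.execute(
    """
    CREATE TABLE files (
        object_key TEXT PRIMARY KEY,
        filename TEXT NOT NULL,
        content_type TEXT NOT NULL,
        size_bytes INTEGER NOT NULL
    )
    """
)


def upload(filename, content_type, data):
    """Store the bytes under a content-hash key and record metadata separately."""
    key = hashlib.sha256(data).hexdigest()
    blob_store[key] = data
    with meta:  # commit the metadata row as one transaction
        meta.execute(
            "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
            (key, filename, content_type, len(data)),
        )
    return key


def find_by_type(content_type):
    """Query metadata only; no blob bytes are read."""
    rows = meta.execute(
        "SELECT filename, object_key FROM files WHERE content_type = ? ORDER BY filename",
        (content_type,),
    )
    return rows.fetchall()
```

Because the key is derived from the content, uploading identical bytes twice reuses one object and one metadata row, which is a simplification; a real system would usually track each upload separately.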
Example Services
Amazon S3 (industry standard)
Google Cloud Storage
Final Thoughts
Each database type excels in specific scenarios. In interviews:
Identify access patterns (reads vs. writes, query complexity).
Consider scalability needs (vertical vs. horizontal).
Combine databases if needed (e.g., Redis cache + PostgreSQL).