Databases An extensive overview of databases, their types, use cases, diagrams, and real-world examples.
In system design and high-level architecture (HLD) , databases form the backbone of any system.
Choosing the right database depends on data structure, query patterns, scalability, and consistency requirements .
Use Case Best DB Type Examples Notes Strict consistency, complex queries Relational (SQL) PostgreSQL, MySQL, Oracle ACID, joins, transactions High-speed caching Key-Value Redis, DynamoDB Simple get/set, low latency Flexible, evolving data Document MongoDB, Firestore JSON, nested, schema-less Analytics on huge data Wide-Column Cassandra, Bigtable Partitioned, scalable Social graphs, relationships Graph Neo4j, ArangoDB Traversals, relationships Time-stamped metrics Time-Series InfluxDB, TimescaleDB Optimized for time, compression Full-text search Search Elasticsearch, Solr Text indexing, scoring AI/ML similarity search Vector Pinecone, Milvus Embeddings, ANN search Multiple models in one Multi-Model ArangoDB, OrientDB, Cosmos DB Flexible, complex apps Immutable audit logs Ledger/Blockchain QLDB, Hyperledger Tamper-proof, auditable Mobile/embedded Embedded SQLite, RocksDB, Realm Lightweight, local Distributed SQL, global scale NewSQL CockroachDB, Spanner, YugabyteDB ACID + horizontal scaling
Structure: Tables (rows & columns), fixed schema, ACID properties
Strengths: Strong consistency, complex queries, joins, transactions
Weaknesses: Harder to scale horizontally, rigid schema
When NOT to use: If you need flexible schema, massive horizontal scaling, or handle unstructured data.
Popular Managed Options: AWS RDS, Azure SQL, Google Cloud SQL
Performance/Scalability: Vertical scaling, some support for read replicas and sharding.
Cost: Can be expensive at scale, especially for licensing (Oracle, SQL Server).
Banking, ERP systems, e-commerce orders, financial transactions
PostgreSQL, MySQL, Oracle, SQL Server
Structure: Simple key β value mapping
Strengths: Extremely fast, simple retrieval, easy to scale horizontally
Weaknesses: Limited query functionality, no relationships
When NOT to use: If you need complex queries or relationships.
Popular Managed Options: AWS DynamoDB, Azure Table Storage, Google Cloud Datastore
Performance/Scalability: Excellent horizontal scaling, in-memory options (Redis).
Cost: Pay-per-request (DynamoDB), memory cost (Redis).
Caching, session storage, shopping carts, leaderboards
Redis, DynamoDB, Memcached
+userId_1001 +name_Alice +age_30
+sessionId_abcd +userId_1001 +expires_10min
+userId_1001 +name_Alice +age_30
+sessionId_abcd +userId_1001 +expires_10min
Structure: JSON-like documents, flexible schema
Strengths: Hierarchical data, flexible, easy to update, semi-structured
Weaknesses: Less efficient for joins, eventual consistency in some cases
When NOT to use: If you need complex joins or strict schema.
Popular Managed Options: MongoDB Atlas, Firebase Firestore, AWS DocumentDB
Performance/Scalability: Good horizontal scaling, sharding.
Cost: Storage + throughput, can be expensive at scale.
CMS, product catalogs, IoT apps, user profiles
MongoDB, Firestore, CouchDB
Structure: Columns grouped in families for scalable analytics
Strengths: Scalable, optimized for analytics, high write throughput
Weaknesses: Complex schema design, not ideal for transactional workloads
When NOT to use: If you need ACID transactions or simple queries.
Popular Managed Options: Google Bigtable, Azure Cosmos DB (Cassandra API), AWS Keyspaces
Performance/Scalability: Excellent horizontal scaling, partitioning.
Cost: Pay-per-use, can be high for large clusters.
Big data analytics, recommendation engines, time-series-like workloads
Cassandra, HBase, Bigtable
Structure: Nodes and edges
Strengths: Complex relationship queries, traversals, flexible schema
Weaknesses: Slower for bulk inserts, not ideal for analytics
When NOT to use: If you have simple, flat data or need high write throughput.
Popular Managed Options: Neo4j Aura, AWS Neptune, Azure Cosmos DB (Gremlin API)
Performance/Scalability: Good for traversals, can be hard to scale for massive graphs.
Cost: Pay-per-node/edge, can be high for large graphs.
Social networks, recommendations, fraud detection, knowledge graphs
Neo4j, ArangoDB, JanusGraph
Structure: Indexed by timestamp
Strengths: Optimized for sequential data, compression, fast range queries
Weaknesses: Narrow use case, not for general-purpose data
When NOT to use: If you need complex relationships or non-time-based queries.
Popular Managed Options: InfluxDB Cloud, AWS Timestream, Timescale Cloud
Performance/Scalability: High write throughput, retention policies.
Cost: Pay-per-ingest, storage.
IoT, monitoring, financial tick data, metrics
InfluxDB, TimescaleDB, Prometheus
Timestamp Temperature_C 2025-09-11 10:00 23.5 2025-09-11 11:00 24.0 2025-09-11 12:00 23.8
Structure: Indexed text for full-text search
Strengths: Full-text search, scoring, faceting, analytics
Weaknesses: Not transactional, eventual consistency
When NOT to use: For transactional or relational data.
Popular Managed Options: Elastic Cloud, AWS OpenSearch, Azure Cognitive Search
Performance/Scalability: Scales horizontally, sharding.
Cost: Storage + compute, can be high for large indexes.
E-commerce search, logs, document search
Elasticsearch, Solr, OpenSearch
Structure: High-dimensional vectors
Strengths: Semantic/AI queries, similarity search, ANN (approximate nearest neighbor)
Weaknesses: Still evolving, not for transactional data
When NOT to use: For traditional queries or transactional workloads.
Popular Managed Options: Pinecone, Weaviate Cloud, Milvus Cloud
Performance/Scalability: Optimized for vector search, horizontal scaling.
Cost: Pay-per-vector, storage.
AI embeddings, semantic search, recommendation, image/audio similarity
Pinecone, Weaviate, Milvus
Structure: Supports multiple paradigms (doc + graph + key-value)
Strengths: Flexible, can handle complex apps with mixed requirements
Weaknesses: Added complexity, may not be best-in-class for any one model
When NOT to use: If you only need one data model.
Popular Managed Options: ArangoDB Oasis, Azure Cosmos DB, OrientDB Cloud
Performance/Scalability: Varies by model, generally good.
Cost: Pay-per-model, storage.
Complex apps, SaaS, hybrid data needs
ArangoDB, OrientDB, Cosmos DB
'Document_id1_name_Alice'
'Graph Alice->friend->Bob'
'KeyValue_session_abcd_userId1'
'Document_id1_name_Alice'
'Graph Alice->friend->Bob'
'KeyValue_session_abcd_userId1'
Structure: Immutable, append-only, cryptographically verifiable
Strengths: Tamper-proof, auditable, traceable
Weaknesses: Slower writes, complex to manage
When NOT to use: For high-throughput or general-purpose data.
Popular Managed Options: Amazon QLDB, Hyperledger Fabric, BigchainDB
Performance/Scalability: Slower than traditional DBs, optimized for audit.
Cost: Pay-per-write, storage.
Audit logs, financial ledgers, supply chain, compliance
Amazon QLDB, Hyperledger, BigchainDB
Block3 Block2 Block1 Block3 Block2 Block1 HashPrev_0x00 HashPrev_Hash1 Block3 Block2 Block1 Block3 Block2 Block1 HashPrev_0x00 HashPrev_Hash1
Structure: Local, file-based, lightweight
Strengths: No server needed, fast, portable
Weaknesses: Not distributed, limited concurrency
When NOT to use: For distributed or multi-user systems.
Popular Managed Options: SQLite, RocksDB, Realm
Performance/Scalability: Excellent for local, limited for distributed.
Cost: Free/open source.
Mobile apps, IoT devices, desktop software
+File_user_db +Tables_users_orders
+File_user_db +Tables_users_orders
Structure: Relational, distributed, SQL interface
Strengths: ACID + horizontal scaling, global distribution
Weaknesses: Newer, less mature ecosystem
When NOT to use: If you donβt need global scale or distributed SQL.
Popular Managed Options: Google Spanner, CockroachDB Cloud, Yugabyte Cloud
Performance/Scalability: Excellent horizontal scaling, global consistency.
Cost: Pay-per-node, storage.
Global apps, SaaS, fintech, distributed transactions
CockroachDB, Google Spanner, YugabyteDB
Category Structure & Features Strengths Weaknesses Best Use Cases Examples Managed/Cloud Cost Notes Relational (SQL) Tables, fixed schema, ACID Strong consistency, joins, transactions Hard to scale horizontally Banking, ERP, e-commerce PostgreSQL, MySQL, Oracle AWS RDS, Azure SQL Can be expensive at scale Key-Value (NoSQL) Key β Value pairs Super fast, simple, scalable Limited queries Caching, session storage Redis, DynamoDB AWS DynamoDB Pay-per-request Document (NoSQL) JSON-like documents Flexible schema, hierarchical Hard for complex joins CMS, product catalogs MongoDB, Firestore MongoDB Atlas Storage + throughput Wide-Column (NoSQL) Column families Scalable, analytical Complex schema Analytics, recommendations Cassandra, HBase Google Bigtable Pay-per-use Graph (NoSQL) Nodes + Edges Relationships Slower bulk inserts Social graphs, fraud detection Neo4j, ArangoDB Neo4j Aura Pay-per-node/edge Time-Series Indexed by time Optimized for sequential data Narrow use case IoT, metrics, finance InfluxDB, TimescaleDB InfluxDB Cloud Pay-per-ingest Search Indexed text Full-text search Not transactional Product search, logs Elasticsearch, Solr Elastic Cloud Storage + compute Vector High-dimensional vectors Semantic / AI queries Still evolving AI embeddings, similarity Pinecone, Weaviate Pinecone Cloud Pay-per-vector Multi-Model Mix of paradigms Flexible Added complexity Complex apps ArangoDB, OrientDB ArangoDB Oasis Pay-per-model Ledger/Blockchain Immutable, auditable Tamper-proof Slower writes Audit logs, compliance QLDB, Hyperledger Amazon QLDB Pay-per-write Embedded Local, file-based Lightweight, fast Not distributed Mobile, IoT, desktop SQLite, RocksDB - Free NewSQL Distributed SQL ACID + scale Newer tech Global SaaS, fintech CockroachDB, Spanner CockroachDB Cloud Pay-per-node
Scenario Recommended DB(s) Reason Twitter PostgreSQL + Redis + Neo4j SQL for accounts, Redis for caching, Graph DB for connections Netflix Cassandra + MongoDB Wide-Column for analytics, Document DB for metadata Uber PostgreSQL + Redis + TimescaleDB SQL for transactions, Key-Value cache, Time-Series for GPS E-commerce site MySQL + Elasticsearch SQL for inventory/orders, Search DB for product search IoT sensor data InfluxDB, TimescaleDB High write throughput, time-series optimized Real-time chat Redis, MongoDB Fast pub/sub, flexible messages Blockchain explorer PostgreSQL, Elasticsearch SQL for transactions, Search for logs Mobile app local SQLite, Realm Embedded, lightweight
Tip: For most modern systems, a polyglot persistence approach (using multiple DBs for different needs) is common!
Links: