WhatsApp 💬
WhatsApp serves over 2 billion users globally, handling 100+ billion messages daily. This document outlines the architecture that enables secure, real-time messaging at massive scale with end-to-end encryption and 99.99% availability.
Connection Servers Persistent Connections
Presence Service Online Status
Push Service Notifications
Accept Connection Client Handshake
Session Management Connection State
Erlang Processes Lightweight Actors
GenServer Connection Handler
Supervisor Process Management
Load Distribution Consistent Hashing
Queue Messages Offline Buffer
Resume Session State Restore
Connection Features:
2M+ concurrent connections per server
Erlang/OTP for lightweight processes
Sub-second reconnection
Offline message queuing
Technologies: Erlang, FreeBSD, Noise Protocol
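The "Load Distribution: Consistent Hashing" step above can be illustrated with a small hash ring. This is a minimal sketch, not WhatsApp's implementation: the class name `HashRing`, the node names, the virtual-node count, and the use of SHA-256 as the ring hash are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical, for illustration)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed on the ring many times to even out the load.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, user_id):
        """Return the node owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._points, _hash(user_id)) % len(self._points)
        return self._owner[self._points[idx]]
```

The property that matters for connection servers is that removing one node only remaps the users that node owned; everyone else keeps their server and their session state.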
Compose Message Client Side
Encrypt Message E2E Encryption
Send to Server Encrypted Payload
Route Message Recipient Lookup
Deliver Message Push to Client
Acknowledgment Delivery Receipt
Push Notification Wake Client
Media Message Image/Video/Audio
Message Delivery Features:
At-least-once delivery guarantee
30-day offline message storage
Read receipts tracking
Multi-device sync
Retry with exponential backoff
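The at-least-once guarantee and the retry policy listed above can be sketched together: a message stays in the offline buffer until the client acknowledges it, and redelivery attempts are spaced by a capped exponential schedule. The names `OfflineQueue` and `backoff_delays`, and the default base, factor, and cap values, are illustrative assumptions rather than documented WhatsApp parameters.

```python
from collections import OrderedDict


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Delay before each retry: base * factor**n, capped at `cap` seconds."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


class OfflineQueue:
    """Per-recipient buffer: a message stays queued until it is acknowledged."""

    def __init__(self):
        self._pending = OrderedDict()  # message_id -> encrypted payload

    def enqueue(self, message_id, payload):
        # A repeated send with the same id does not create a second entry.
        self._pending.setdefault(message_id, payload)

    def pending(self):
        """Everything not yet acknowledged, oldest first (re-sent on reconnect)."""
        return list(self._pending.items())

    def ack(self, message_id):
        """Drop a message once the client confirms receipt."""
        self._pending.pop(message_id, None)
```

Because a message may be delivered more than once before its acknowledgment arrives, the receiving client is expected to deduplicate by message id.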
Plaintext Message User Content
AES-256-CBC Symmetric Encryption
HMAC-SHA256 Authentication
Ciphertext Encrypted Payload
Identity Key Long-term Public Key
Signed Pre-Key Medium-term Key
One-time Pre-Keys Single Use
Session Key Per Message Chain
Fetch Recipient Keys Initial Setup
X3DH Protocol Key Agreement
Derive Session Key Shared Secret
DH Ratchet New Keys Per Round Trip
Symmetric Ratchet Chain Key Derivation
Forward Secrecy Past Message Protection
Future Secrecy Compromise Recovery
Signal Protocol Features:
Forward secrecy (past messages protected)
Future secrecy (recovery from compromise)
Deniability (no proof of sender)
Asynchronous key exchange
Multi-device support
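The symmetric ratchet ("Chain Key Derivation") behind forward secrecy can be sketched with HMAC-SHA256. The constants `0x01` and `0x02` follow the example instantiation in the published Double Ratchet specification; the function names are hypothetical, and this sketch omits the DH ratchet and the actual message encryption.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes):
    """Advance the chain one step: return (message_key, next_chain_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


def message_keys(chain_key: bytes, count: int):
    """Derive `count` successive message keys from a starting chain key."""
    keys = []
    for _ in range(count):
        mk, chain_key = ratchet_step(chain_key)
        keys.append(mk)
    return keys
```

Each step is one-way: holding a later chain key does not reveal earlier message keys, which is what protects past messages if a device is compromised afterwards.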
Send to Group Single Encrypt
Server Fanout To All Members
Deliver to Members Individual Push
Standard Group Up to 1024 Members
Broadcast List One-to-Many
Remove Members Admin Action
Admin Management Permissions
Key Rotation On Member Change
Pairwise Encryption For Key Distribution
Group Features:
Sender Key protocol for efficiency
Single encryption for all members
Key rotation on membership changes
Admin controls and permissions
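The key-management side of the Sender Key approach above can be modeled without any real cryptography: the sender holds one key per group, distributes it to the other members over pairwise sessions, and rotates it when a member is removed. `GroupSenderKey` and its methods are hypothetical names, and the model covers removal only.

```python
import secrets


class GroupSenderKey:
    """Tracks one sender's key for a group (illustrative model, no real crypto)."""

    def __init__(self, sender, members):
        self.sender = sender
        self.members = set(members)
        self.key = secrets.token_bytes(32)
        self.generation = 0

    def distribution_targets(self):
        """Members that must receive the current key over pairwise sessions."""
        return sorted(self.members - {self.sender})

    def remove_member(self, member):
        """Rotate so the removed member cannot read later messages."""
        self.members.discard(member)
        self.key = secrets.token_bytes(32)
        self.generation += 1
        return self.distribution_targets()
```

The pairwise cost is paid once per key generation; after that, each group message is encrypted a single time and the server fans out the same ciphertext to every member.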
Initiate Call Caller Action
ICE Candidates NAT Traversal
P2P Connection Direct or Relay
Bandwidth Adaptation Quality Control
Group Call Up to 32 People
STUN Servers NAT Discovery
Quality Monitoring SRTT, Jitter
Calling Features:
Peer-to-peer with TURN fallback
End-to-end encrypted media
Adaptive bitrate streaming
Group calls up to 32 participants
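The "direct or relay" preference can be illustrated with the candidate priority formula and recommended type preferences from RFC 8445 (ICE). Whether WhatsApp's clients use exactly these values is not stated in this document, so treat this as a sketch of standard ICE behavior; the function names are illustrative.

```python
# Type preferences recommended by RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(kind: str, local_preference: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    return (TYPE_PREFERENCE[kind] << 24) + (local_preference << 8) + (256 - component_id)


def rank(kinds):
    """Order candidate types from most to least preferred."""
    return sorted(kinds, key=candidate_priority, reverse=True)
```

With these preferences a direct path (host or server-reflexive, discovered via STUN) always outranks a TURN relay, so the relay is used only when no direct pair passes connectivity checks.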
Text Status Background + Text
Fetch Status Contact's Status
Notify Poster View Receipt
My Contacts Except Exclusion List
Only Share With Inclusion List
User Table Phone, Keys, Settings
Session Table Connection State
Offline Queue Pending Messages
Group Table Group Metadata
Disc Only Copies Large Data
Table Fragmentation Horizontal Partition
Sticky Sessions User Affinity
Distributed Lookup Hash Ring
Dirty Operations Fast Path
Messages Keyspace Offline Messages
Media Keyspace Media Metadata
Status Keyspace Stories Data
messages_by_user PK: user_id CK: timestamp
media_by_message PK: message_id
status_by_user PK: user_id, TTL: 24h
Local One Reads Low Latency
Consistency Levels Per Operation
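The `status_by_user` layout above (partitioned by `user_id`, rows expiring after 24 hours) can be mimicked in memory to show how partition key and TTL interact on reads. This is a stand-in, not a database client: the class name is hypothetical, and ordering rows by posting time is an assumption borrowed from `messages_by_user`, since the schema above lists no clustering key for statuses.

```python
import bisect
from collections import defaultdict

STATUS_TTL_SECONDS = 24 * 60 * 60  # matches the 24h TTL on status_by_user


class StatusByUser:
    """In-memory stand-in for a table partitioned by user_id with a row TTL."""

    def __init__(self, ttl=STATUS_TTL_SECONDS):
        self._ttl = ttl
        self._rows = defaultdict(list)  # user_id -> sorted [(posted_at, status_id)]

    def insert(self, user_id, posted_at, status_id):
        bisect.insort(self._rows[user_id], (posted_at, status_id))

    def fetch(self, user_id, now):
        """Return unexpired status ids for one partition, oldest first."""
        return [sid for ts, sid in self._rows.get(user_id, []) if now - ts < self._ttl]
```

A fetch touches a single partition, which is what makes a low-latency, single-replica read level a reasonable fit for this access pattern.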
Client Upload Encrypted Media
Store Blob Content-addressed
Thumbnail Generation Preview Images
Compression Size Reduction
Pre-signed URLs Secure Access
Streaming Progressive Download
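The "Store Blob: Content-addressed" step can be sketched as a store keyed by the hash of the uploaded bytes. Since the client uploads media already encrypted, the server only ever hashes ciphertext. `BlobStore` and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


class BlobStore:
    """Content-addressed store: the key is the SHA-256 of the stored bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, encrypted_blob: bytes) -> str:
        key = hashlib.sha256(encrypted_blob).hexdigest()
        self._blobs.setdefault(key, encrypted_blob)  # identical uploads share one copy
        return key

    def get(self, key: str) -> bytes:
        blob = self._blobs[key]
        # The address doubles as an integrity check on read.
        if hashlib.sha256(blob).hexdigest() != key:
            raise ValueError("blob does not match its address")
        return blob

    def __len__(self):
        return len(self._blobs)
```

Deduplication in this model only applies when the ciphertext is byte-identical, for example when the same encrypted media is forwarded rather than re-encrypted under a new key.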
Lightweight Processes 2KB per process
Massive Concurrency Millions of processes
Fault Tolerant Let it Crash
Hot Code Loading Zero Downtime Deploy
Application System Component
Distributed Erlang Node Communication
Global Registry Name Resolution
Reductions Fair Scheduling
Per-process GC No Stop-the-world
Binary Handling Reference Counting
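The "Reductions: Fair Scheduling" idea can be imitated with Python generators: every process gets a fixed budget of work units per turn and is then moved to the back of the run queue. This is a toy model of the concept, not the BEAM scheduler; the budget of 2 and the function names are illustrative.

```python
from collections import deque


def worker(name, steps):
    """A toy process: each yield stands for one reduction of work."""
    for i in range(steps):
        yield f"{name}:{i}"


def run(processes, budget=2):
    """Round-robin scheduler: preempt a process after `budget` reductions."""
    queue = deque(processes)
    trace = []
    while queue:
        proc = queue.popleft()
        for _ in range(budget):
            try:
                trace.append(next(proc))
            except StopIteration:
                break  # process finished, do not reschedule it
        else:
            queue.append(proc)  # budget used up, go to the back of the run queue
    return trace


print(run([worker("a", 3), worker("b", 1)]))  # ['a:0', 'a:1', 'b:0', 'a:2']
```

The long-running process `a` cannot starve `b`: it is preempted after its budget, which is how a busy connection handler avoids delaying millions of others.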
US Regions Multiple Datacenters
Edge POP 1 Connection Termination
Edge POP 2 Connection Termination
Edge POP N Connection Termination
Connection Cluster Persistent Connections
Message Cluster Routing & Storage
Media Cluster Blob Storage
Call Cluster Signaling & Relay
Local SQLite Message Cache
Lazy Loading Media on Demand
Compress Upload Reduced Size
Batch Sync Efficient Updates
Binary Protocol Compact Format
Multiplexing Single Connection
ETS Tables In-memory Cache
Message Latency < 200ms P99
Connections 2M+ per server
No Message Storage Delivered = Deleted
Metadata Minimization Essential Only
Short Retention 30 Days Max
Last Seen Everyone/Contacts/Nobody
Profile Photo Visibility Control
About Info Visibility Control
Block User No Communication
Report User Abuse Reporting
Rate Limiting Broadcast Limits
Business Label Identification
Product Catalog Business Data
Commerce Data Transaction Info
Phone Number Auth SMS/Call Verification
Two-Step Verification PIN Code
Biometric Lock App Security
Device Linking Multi-device Auth
End-to-End Encryption Signal Protocol
Encrypted Backups Optional E2E
Local Encryption On-device
TLS 1.3 Transport Security
Forward Limits Misinformation
Ban System Policy Violations
Appeal Process Account Recovery
Security Code Key Verification
QR Code Scan In-person Verify
Key Change Notification Alert
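The security code used for key verification can be sketched as a 60-digit string built from two 30-digit fingerprints, loosely following the outline in WhatsApp's public security whitepaper (iterated SHA-512 over the identity key and user identifier, then six 5-byte chunks reduced modulo 100000). The exact per-round hash input and the sorting of the two halves are assumptions here, so the output will not match a real client.

```python
import hashlib

ITERATIONS = 5200  # iteration count as described in the public whitepaper


def numeric_fingerprint(identity_key: bytes, user_id: bytes) -> str:
    """30 digits derived from an iterated SHA-512 over the key and identifier."""
    digest = identity_key + user_id
    for _ in range(ITERATIONS):
        # Assumption: the key is mixed back in on every round.
        digest = hashlib.sha512(digest + identity_key).digest()
    chunks = [digest[i:i + 5] for i in range(0, 30, 5)]
    return "".join(f"{int.from_bytes(c, 'big') % 100000:05d}" for c in chunks)


def security_code(key_a, id_a, key_b, id_b) -> str:
    """60 digits; sorting makes both parties compute the same string."""
    parts = sorted([numeric_fingerprint(key_a, id_a), numeric_fingerprint(key_b, id_b)])
    return "".join(parts)
```

Both users derive the same code from the same pair of public keys, so comparing it in person (or scanning the QR code that encodes it) detects a substituted key.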
Message Metrics Volume, Latency
Connection Metrics Count, Churn
Media Metrics Upload, Download
Call Metrics Setup, Quality
Anomaly Detection ML-based
Operations Dashboard Real-time Status
Regional Dashboard Per Region Health
Capacity Dashboard Resource Usage
Structured Logs JSON Format
Distributed Tracing Request Flow
Security Audit Access Logs
Branch flow (main, feature-branch): Feature Dev, Code Changes, Unit Tests, Integration Tests, Build & Release, Canary Deploy, Production Rollout
Build & Test Erlang Dialyzer
Canary Deployment 1% Users
Monitor Metrics Delivery Rate, Latency
Progressive Rollout Region by Region
Hot Code Rollback No Downtime
Erlang hot code loading: Zero-downtime deployments
Regional rollout: Country-by-country deployment
Canary analysis: Automated metric comparison
Instant rollback: Hot code swap capability
Configuration management: Centralized config distribution
Erlang releases: OTP release handling
Container orchestration: Custom clustering
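The automated metric comparison used for canary analysis can be sketched as a gate over the two metrics this document names for monitoring, delivery rate and latency. The function name, metric keys, and thresholds are hypothetical; a production gate would also account for sample size and noise.

```python
def canary_ok(baseline, canary, max_latency_ratio=1.10, max_delivery_drop=0.001):
    """Return (ok, reasons) comparing canary metrics against the baseline."""
    reasons = []
    if canary["p99_latency_ms"] > baseline["p99_latency_ms"] * max_latency_ratio:
        reasons.append("latency regression")
    if baseline["delivery_rate"] - canary["delivery_rate"] > max_delivery_drop:
        reasons.append("delivery rate drop")
    return (not reasons, reasons)
```

A failing gate would trigger the hot code rollback path instead of widening the rollout to the next region.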
Network Partition Split Brain
Overload Test Traffic Spike
OTP Supervisor Auto-restart
Erlang "let it crash": Built-in fault tolerance
Supervisor trees: Automatic process recovery
Network partition testing: Split-brain scenarios
Load testing: Peak traffic simulation
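The "let it crash" pattern with supervisor auto-restart can be approximated in a few lines: the child contains no defensive error handling, and the supervisor restarts it until a restart budget is exhausted, then escalates. This is a simplified analogue of an OTP supervisor's restart intensity, with hypothetical names and no time window.

```python
def supervise(child, max_restarts=3):
    """Run `child`; on a crash, restart it until the restart budget is spent."""
    restarts = 0
    while True:
        try:
            return child(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # escalate, as a supervisor does when intensity is exceeded
            restarts += 1


def make_flaky(failures):
    """Build a child that crashes `failures` times and then succeeds."""
    state = {"calls": 0}

    def child():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RuntimeError("crash")
        return "ok"

    return child


print(supervise(make_flaky(2)))  # ('ok', 2)
```

Escalating after repeated failures matters as much as restarting: a child that keeps crashing signals a fault the parent supervisor, or an operator, has to handle.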
Delivery Metrics Anonymized
Quality Metrics Call Quality
Stream Processing Real-time Analytics
Batch Processing Historical Analysis
Aggregation Privacy-preserving
Spam Detection Account/Content
Abuse Detection Behavioral
Quality Prediction Call Routing
Capacity Planning Demand Forecast
Route Optimization Call Quality
Capacity Management Auto-scaling
Spam detection: Account and content spam without reading messages
Abuse prevention: Behavioral pattern detection
Quality optimization: Call routing and media server selection
Capacity planning: Regional demand forecasting
Note: All ML is done on metadata, never on message content
WhatsApp Infrastructure Cost Distribution
Compute (Erlang Servers): 30%
Network & Bandwidth: 25%
Storage (Media): 20%
Database (Mnesia, Cassandra): 15%
CDN & Edge: 7%
Monitoring & Operations: 3%
Erlang Efficiency Millions of Processes
High Density More Users per Server
Custom Optimizations Protocol Efficiency
Binary Protocol Minimal Overhead
Compression Data Reduction
Persistent Connections Reduced Handshakes
Message Batching Reduced RTTs
Media Expiration Auto-delete
No Message Storage Delete After Delivery
Media Deduplication Shared Storage
Full Automation No Manual Ops
Self-healing OTP Supervisors
Minimal Overhead Lean Operations
Linked Devices Multi-device Sync
Device Sync Cross-platform
Web Companion Browser Access
Desktop Apps Native Experience
View Once Self-destructing
Disappearing Messages Auto-delete
Private Groups Invite Links
Backup Encryption E2E Cloud Backup
WhatsApp Business Enterprise API
WhatsApp Pay In-chat Payments
AI Chatbots Customer Service
WebRTC Evolution AV1 Codec
MLS Protocol Group Encryption
Interoperability Cross-platform Messages
Quantum-safe Post-quantum Crypto
WhatsApp's architecture demonstrates expertise in building secure, reliable messaging at unprecedented scale. The system successfully manages:
Massive Scale: 2B+ users, 100B+ messages daily
End-to-End Encryption: Signal Protocol for all messages
High Availability: 99.99% uptime globally
Low Latency: Sub-200ms message delivery
Efficiency: Small engineering team, minimal infrastructure
Erlang/OTP Foundation
Lightweight processes (millions per node)
Fault tolerance (let it crash)
Hot code loading (zero downtime)
Distributed by design
Security First
End-to-end encryption by default
Minimal data retention
Forward and future secrecy
User privacy controls
Simplicity
Single-purpose focus
Minimal features, maximum reliability
Small team, high impact
Avoid premature optimization
Global Scale
Edge presence worldwide
Regional data handling
Offline-first design
Efficient protocols
Data Minimization
No message storage after delivery
Minimal metadata
Short retention periods
Privacy by design
The platform continues to evolve with features like multi-device support, disappearing messages, and business messaging, while maintaining the core principles of security, privacy, and simplicity.
This architecture represents WhatsApp's known systems and best practices. Actual implementation details may vary.