NoSQL for Strong Senior Developers
What are the main types of NoSQL databases and how do they differ?
NoSQL databases fall into four main categories:
Document (e.g., MongoDB): stores JSON-like documents with a flexible schema.
Key-Value (e.g., Redis, DynamoDB): maps a key to a value; lookups by key are
very fast, but querying by anything other than the key is limited.
Column-Family (e.g., Cassandra, HBase): rows are partitioned by key and hold
sparse columns grouped into families; suited to high write throughput and wide rows.
Graph (e.g., Neo4j): stores nodes and edges, optimized for traversing and analyzing relationships.
Each type is optimized for different use cases and data access patterns.
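To make the four categories concrete, here is a minimal sketch of how the same information might be shaped in each model. It uses plain Python structures rather than any real database client, and the customer "u1", order "o1", and helper orders_placed_by are hypothetical names chosen for illustration.

```python
# Hypothetical data: one customer "u1" who placed one order "o1".

# Document model: one self-contained, nested record per entity.
document = {
    "_id": "o1",
    "customer": {"id": "u1", "name": "Ada"},
    "items": [{"sku": "A-1", "qty": 2}],
}

# Key-value model: an opaque value that is retrieved only by its key.
key_value = {"order:o1": '{"customer": "u1", "total": 40}'}

# Column-family model: partition key -> clustering key -> sparse columns.
column_family = {
    "u1": {  # partition key (the customer)
        "2024-01-05#o1": {"total": 40, "status": "paid"},  # one row per order
    }
}

# Graph model: nodes plus typed edges between them.
nodes = {"u1": {"label": "Customer"}, "o1": {"label": "Order"}}
edges = [("u1", "PLACED", "o1")]


def orders_placed_by(customer_id):
    """Follow PLACED edges that start at the given customer node."""
    return [dst for src, rel, dst in edges if src == customer_id and rel == "PLACED"]
```

The point of the sketch is the access path each shape favors: a nested read for the document, a single key lookup for the key-value pair, a partition scan for the column family, and an edge traversal for the graph.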
How would you choose between MongoDB, Cassandra, and Redis for a new
system?
MongoDB:
Good for flexible documents, semi-structured data, rich querying.
Single-document writes are atomic; multi-document ACID transactions are available since version 4.0.
Cassandra:
Optimized for high-volume writes and availability.
Eventually consistent by default, with consistency levels tunable per query (e.g., ONE, QUORUM, ALL).
Redis:
In-memory key-value store — extremely fast.
Used for caching, pub/sub, leaderboards, session storage.
Decision factors:
Access patterns (read-heavy, write-heavy, key-value vs document vs time-series).
Consistency and durability requirements.
Latency and throughput targets.
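The caching use case mentioned for Redis usually follows the cache-aside pattern. The sketch below illustrates that pattern with an in-process dictionary standing in for the cache, so it runs without a server. TTLCache, get_user, and load_from_db are hypothetical names, not part of any Redis client API.

```python
import time


class TTLCache:
    """In-process stand-in for an external cache (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache first, fall back to the primary store."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)  # slow path: hit the system of record
    cache.set(key, value)          # populate the cache for later reads
    return value
```

The TTL bounds how stale a cached value can get, which is the consistency cost paid for the lower read latency.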
How does consistency differ between RDBMS and NoSQL databases?
RDBMS:
ACID transactions are the default; the configured isolation level determines what concurrent transactions can observe.
NoSQL:
Many stay available during network partitions and accept eventual consistency in
return (the CAP theorem trade-off).
Some provide tunable consistency (e.g., Cassandra, DynamoDB).
Trade-off:
Stronger consistency → reduced availability in network partitions.
Eventual consistency → better availability but requires conflict resolution.
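Tunable consistency is often reasoned about with quorum arithmetic: with N replicas, a write acknowledged by W replicas and a read that contacts R replicas are guaranteed to share at least one replica when R + W > N. The sketch below only checks that inequality; the function names are hypothetical, and it assumes strict quorums with no other failure modes.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def quorums_overlap(n, w, r):
    """True when every read set must intersect every write set (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n
```

For example, with N = 3, writing and reading at quorum (W = 2, R = 2) overlaps, while W = 1 and R = 1 does not, which is the configuration that is fastest and most available but can return stale data.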
What are common pitfalls when modeling data in NoSQL, and how do you avoid
them?
Pitfalls:
Normalizing data as in an RDBMS: most NoSQL stores lack efficient joins, so this leads to application-side joins and extra round trips.
Underestimating the impact of access patterns: NoSQL models are query-driven.
Uncontrolled document structure (e.g., inconsistent field names and types across documents in MongoDB).
Best practices:
Design for denormalization and embed where practical.
Model based on read patterns, not entity relationships.
Use schema validation (e.g., MongoDB's $jsonSchema validator) when needed.
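As an illustration of these practices, the sketch below contrasts a normalized layout with a document denormalized for one read path, then applies a small hand-written check in the spirit of schema validation. The data, REQUIRED_FIELDS, and validation_errors are hypothetical; a real system would more likely rely on the database's own validator.

```python
# Normalized, RDBMS-style: showing one post needs two lookups.
posts = {"p1": {"title": "Hello", "author_id": "u1"}}
users = {"u1": {"name": "Ada"}}

# Denormalized for the read path "show a post with its author and comments".
post_document = {
    "_id": "p1",
    "title": "Hello",
    "author": {"id": "u1", "name": "Ada"},  # author name duplicated on purpose
    "comments": [{"by": "Bob", "text": "Nice"}],
}

# Minimal stand-in for a schema: required field name -> expected type.
REQUIRED_FIELDS = {"_id": str, "title": str, "author": dict, "comments": list}


def validation_errors(doc):
    """Return a list of problems; an empty list means the document passes."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors
```

The duplication in post_document trades write-time work (updating the author name in every post) for a single-read page render, which is the usual bargain when modeling around read patterns.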
How do sharding and replication work in NoSQL databases?
Sharding:
Horizontal partitioning of data across nodes.
Improves scalability and throughput.
Replication:
Creates copies of data across nodes for fault tolerance and read scalability.
Examples:
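MongoDB: distributes a collection by a shard key (ranged or hashed) and replicates each shard as a replica set.
Cassandra: places rows by consistent hashing of the partition key, with a replication factor set per keyspace.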
Considerations:
Shard key selection is critical for load balancing.
Replication strategy affects consistency and availability.
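Consistent hashing combines both ideas: a key's hash picks its position on a ring, and the next distinct nodes clockwise hold its replicas. The following is a simplified sketch, not Cassandra's actual implementation (which uses a Murmur3 partitioner and its own replication strategies). HashRing, replicas_for, and the vnodes parameter are hypothetical names, and SHA-256 is used only because it ships with the standard library.

```python
import bisect
import hashlib


def _token(value):
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each physical node owns several virtual positions to even out load.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def replicas_for(self, key, replication_factor=1):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect.bisect_right(self._tokens, _token(key))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == replication_factor:
                    break
        return owners
```

A call such as HashRing(["a", "b", "c"]).replicas_for("user:42", replication_factor=2) returns two distinct nodes. When a node is added, only the keys whose nearest clockwise token now belongs to the new node change their primary owner, which is why this scheme limits data movement during rebalancing.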