45+ DBMS Interview Questions and Answers 2026

DBMS interview questions, basic dbms interview questions, and common dbms interview questions show up in campus placements, backend screening rounds, and data roles before vendors like PostgreSQL or Oracle dominate the conversation. Interviewers expect you to explain DBMS vs RDBMS, walk through ER modeling and keys, justify normal forms, describe ACID and isolation levels, compare index types, and reason about SQL vs NoSQL trade-offs—not only write SELECT statements.

Below are 45 questions with elaborate answers; technical sections include a strong answer sample you can say aloud. Pair with SQL technical interview questions for live query writing, PostgreSQL interview questions for engine-specific depth, MongoDB interview questions for document-store contrast, data science interview questions for analytics pipelines, Python developer interview questions for ORM context, and full stack developer interviews when system design ties app tiers to persistence.

NOTE

Prep target: Normalize a messy denormalized table on a whiteboard, explain ACID with a transfer example, and defend when you would break 3NF for performance—out loud, as if in a mid-level loop.

Tested on: Ubuntu 25.04 (Plucky Puffin); kernel 6.14.0-37-generic; PostgreSQL 17.7 for transaction and constraint examples ({run=false} SQL).

Interview context and how to prepare

What do DBMS interviews actually test?

DBMS interviews test foundational database theory that every application engineer should know—even if daily work hides it behind an ORM.

Layer	What interviewers probe
Concepts	DBMS vs RDBMS, schema vs instance, three-schema architecture
Modeling	ER diagrams, keys, mapping to tables
Design	Normalization (1NF–BCNF), when to denormalize
Transactions	ACID, isolation levels, deadlocks
Physical layer	Indexes, query plans at a high level
Architecture	Replication, backup types, SQL vs NoSQL

Role	Emphasis
Campus / junior	ER model, normal forms, SQL basics
Backend	Transactions, indexing rationale, ORM pitfalls
Data / analytics	Star schema, denormalization, ETL idempotency
DBA track	Concurrency, recovery, backup/restore

Campus and experienced loops both expect you to explain normalization, keys, and ACID with examples—not only memorize definitions.

DBMS vs RDBMS — what is the difference?

A DBMS (Database Management System) is software that lets users define, store, retrieve, and manage data with controlled access. The term is broad—it can include hierarchical, network, object-oriented, or document systems.

An RDBMS (Relational DBMS) is a DBMS that stores data in tables (relations) with rows and columns, enforces relational integrity via keys and constraints, and uses SQL (or SQL-like languages) for access.

Aspect	DBMS (general)	RDBMS
Data model	Various (files, graphs, documents)	Tables / relations
Relationships	Model-dependent	Foreign keys, joins
Query language	Varies	SQL standard (mostly)
Examples	MongoDB, Redis, Neo4j	PostgreSQL, MySQL, Oracle

Every RDBMS is a DBMS; not every DBMS is relational. In interviews, say RDBMS when discussing joins, normalization, and ACID on tabular schemas; say DBMS when comparing storage paradigms broadly.

DBMS theory for developers vs DBAs?

Topic	Developer loop	DBA loop
Schema	ER → tables, migrations	Capacity, partitioning
Normalization	Avoid update anomalies in app schema	Rarely redesigns live OLTP casually
Transactions	Boundaries in code, retry on deadlock	Lock monitoring, long transactions
Indexes	"We need one for this WHERE clause"	Index bloat, maintenance windows
Backup	Assumes restore works	RPO/RTO, PITR, drill restores
HA	Connection strings, read replicas	Failover, replication lag

Basic dbms interview questions skew toward theory (normal forms, keys). Senior backend loops add isolation anomalies and index trade-offs. DBA tracks go deeper on recovery and replication—see PostgreSQL interview questions for engine-specific depth.

What is a typical DBMS interview loop?

Round	Duration	Focus
Aptitude / screening	30 min	SQL basics, DBMS definitions
Theory	45 min	ER, normalization, ACID
SQL coding	45–60 min	Joins, aggregates—see SQL prep
System design (senior)	45 min	SQL vs NoSQL, caching, sharding
Scenario	30 min	"Design a library / e-commerce schema"

Campus rounds often ask draw ER diagram for university registration; product companies add transaction isolation and index choice for backend roles.

What is a realistic 3–5 week DBMS prep plan?

Week	Focus	Output
1	DBMS vs RDBMS, ER model, keys	Draw ER for library or hospital system
2	1NF–BCNF with examples	Normalize a denormalized order table
3	ACID, isolation, deadlocks	Explain transfer + phantom read
4	Indexes, query processing basics	B-tree vs hash; when index hurts INSERT
5	SQL vs NoSQL, CAP, mock scenarios	Defend Postgres vs Mongo for one use case

Practice 10 SQL problems weekly from SQL technical interview questions. Run transactions locally on PostgreSQL to see COMMIT / ROLLBACK behavior.

Database fundamentals

Data vs information?

Data are raw facts—numbers, strings, timestamps—without context. Information is data processed into a meaningful form for decisions.

Example	Data	Information
Retail	`42`, `SKU-9`, `2026-06-27`	"42 units of SKU-9 sold today—reorder threshold breached"
Logs	`HTTP 500` at `14:02`	"Error rate spiked 3× after deploy at 14:00"

A DBMS stores data; applications and reports turn it into information via constraints, relationships, and queries. Metadata (schema, statistics) describes the data itself.

A strong answer is:

Data are raw values; information is data with context and meaning—a DBMS stores data; queries and business rules produce information for decisions.

Why use a database instead of files?

Flat files work for simple scripts; production systems need structured concurrent access.

Concern	Files	DBMS
Concurrency	Race conditions, manual locking	Transaction manager, locks
Integrity	App must enforce rules	Constraints, foreign keys
Query	Parse entire file	Indexed lookup, optimizer
Security	OS file permissions	Roles, grants, RLS (engine-dependent)
Recovery	Manual backups	WAL, point-in-time recovery
Schema evolution	Breaking changes fragile	Migrations, versioning

Files remain fine for logs, CSV exports, and static assets; OLTP and shared mutable state belong in a DBMS.

A strong answer is:

Files lack concurrent safe updates, declarative integrity, and efficient indexed queries—a DBMS gives transactions, constraints, and recovery that files force every app to reinvent.

Schema vs instance vs database?

Term	Meaning
Database	Organized collection of related data managed by DBMS
Schema	Logical structure—tables, columns, constraints, views (the design)
Instance	Actual data stored at a moment in time (the content)

Analogy: schema is the blueprint of a building; instance is the furniture and occupants inside.

One DBMS server can host multiple databases (e.g. postgres, myapp_prod). Each database has its own schema namespace. Catalog / system tables store metadata about schemas.

A strong answer is:

Schema is the logical design—tables and constraints; instance is the live data at a point in time; I do not confuse CREATE TABLE (schema) with INSERT (instance).

Three-schema architecture?

ANSI/SPARC three-level architecture separates concerns so storage changes do not break applications.

Level	Name	What it describes	Audience
External	View level	User-specific views	App developers, analysts
Conceptual	Logical level	Entities, relationships, constraints	Data modelers
Internal	Physical level	Files, indexes, allocation	DBAs, engine

Mapping layers translate between levels. Logical data independence: change conceptual schema with minimal external view changes. Physical data independence: move indexes or storage without rewriting apps.

A strong answer is:

Three-schema architecture separates user views, logical ER model, and physical storage—so I can add an index or split a table without rewriting every application query.

Major DBMS components?

Component	Role
DDL processor	Parses CREATE/ALTER/DROP; updates catalog
DML processor	Runs INSERT/UPDATE/DELETE/SELECT
Query processor	Parser → optimizer → execution plan
Storage manager	Pages, buffers, disk I/O
Transaction manager	ACID, commit/abort, isolation
Recovery manager	WAL, checkpoint, crash recovery
Concurrency control	Locks, timestamps, MVCC (engine-specific)

Understanding the pipeline helps in interviews: "Why is this query slow?" → parse OK, bad plan or missing index or lock wait.

A strong answer is:

I name the query processor, storage manager, transaction manager, and recovery manager—then tie slow queries to optimizer plans and lock waits, not mysticism.

Common data models in DBMS?

Model	Structure	Example systems
Relational	Tables, keys	PostgreSQL, MySQL
Document	JSON/BSON documents	MongoDB
Key-value	Opaque key → value	Redis, DynamoDB
Column-family	Wide columns per row key	Cassandra, HBase
Graph	Nodes and edges	Neo4j
Hierarchical / Network	Legacy models	IMS, IDMS (historical)

Relational dominates OLTP with strong consistency needs. Document fits flexible schemas; graph fits relationship-heavy traversals. Choice depends on access patterns, not hype.

A strong answer is:

Relational for tabular integrity and joins; document for flexible nested records; graph for deep relationship traversals—I pick by query pattern, not buzzwords.

DDL, DML, DCL, and TCL?

SQL command categories:

Category	Purpose	Examples
DDL (Data Definition)	Schema structure	`CREATE`, `ALTER`, `DROP`, `TRUNCATE`
DML (Data Manipulation)	Data changes	`INSERT`, `UPDATE`, `DELETE`, `SELECT`
DCL (Data Control)	Permissions	`GRANT`, `REVOKE`
TCL (Transaction Control)	Transaction boundaries	`BEGIN`, `COMMIT`, `ROLLBACK`, `SAVEPOINT`

TRUNCATE is DDL-like—fast, often cannot roll back, resets identity. DELETE is DML—row-level, logged, transactional.

sql


BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 100);
ROLLBACK;  -- row not visible after rollback

A strong answer is:

DDL defines schema; DML changes rows; DCL grants access; TCL wraps units of work—I use COMMIT/ROLLBACK so partial updates never leave inconsistent balances.

ER model and relational mapping

What is the ER model?

The Entity-Relationship (ER) model is a conceptual diagram notation for database design before tables exist.

Element	Meaning
Entity	Object type (Student, Course, Order)
Attribute	Property (name, date, amount)
Relationship	Association between entities (enrolls, contains)
Cardinality	1:1, 1:N, M:N

Weak entity depends on another entity for identity (OrderLine depends on Order). Derived attribute computed from others (age from birthdate)—often not stored.

ER diagrams communicate with stakeholders before migration scripts. Mapping rules convert ER → relational tables.

A strong answer is:

ER model is conceptual—entities, attributes, relationships, cardinality—I draw it first with stakeholders, then map to tables with keys before writing DDL.

Cardinality: 1:1, 1:N, M:N?

Cardinality	Meaning	Relational mapping
1:1	One A ↔ one B	Foreign key on either side (or merge if always together)
1:N	One A → many B	FK on the many side
M:N	Many A ↔ many B	Junction table with two FKs

Example M:N: Students enroll in Courses → enrollments(student_id, course_id, grade).

Participation: total (every entity must participate) vs partial. Optional relationship → nullable FK.

A strong answer is:

1:N puts the foreign key on the many side; M:N needs a junction table with both keys—I state cardinality before writing CREATE TABLE.

How do you map ER to relational tables?

ER construct	Relational table
Strong entity	Table; PK on identifier
Weak entity	Table; PK includes owner FK
1:N relationship	FK on N-side
M:N relationship	Junction table + composite PK
Multivalued attribute	Separate table (entity_id, value)
1:1 relationship	FK on one side; unique constraint

Example — university:

students(id, name)
courses(id, title)
enrollments(student_id, course_id, semester) PK (student_id, course_id)

Avoid NULL-heavy designs when relationship attributes exist—junction table carries enrollment date, grade, etc.

A strong answer is:

Each entity becomes a table; 1:N is a foreign key on the many side; M:N is a junction table—I move relationship attributes onto the table that represents the association.

Entity set vs relationship set?

An entity set is a collection of entities of the same type—all Employee rows share attributes like emp_id, name.

A relationship set is a collection of relationship instances linking entities—Works_In(emp, dept) pairs employees with departments.

In relational terms:

Entity set → table
Relationship set → FK columns or junction table

Relationship attributes (e.g. since_date on Works_In) belong on the relationship representation—not duplicated on both entity tables.

A strong answer is:

Entity set is a type of object; relationship set is associations between them—in SQL, relationships become foreign keys or a bridge table with their own attributes.

Weak entity and identifying relationship?

A weak entity cannot be uniquely identified by its own attributes alone—it depends on a owner entity.

Example: InvoiceLine identified by (invoice_id, line_no)—line_no is unique only per invoice.

Concept	Role
Owner entity	Strong entity (Invoice)
Identifying relationship	Links weak to owner
Partial key	Discriminator within owner scope (line_no)

DDL: composite primary key (invoice_id, line_no) with invoice_id FK → invoices(id) ON DELETE CASCADE often appropriate.

A strong answer is:

Weak entities need the owner's key in their primary key—InvoiceLine uses (invoice_id, line_no) because line numbers repeat across invoices.

Keys, constraints, and integrity

Super key, candidate key, primary key, alternate key?

Key type	Definition
Super key	Any set of attributes that uniquely identifies a row
Candidate key	Minimal super key—no proper subset is also a super key
Primary key	Chosen candidate key for the table
Alternate key	Candidate key not selected as PK (enforced UNIQUE)
Foreign key	References PK (or unique) in another table

Example employees:

Super keys: {emp_id}, {emp_id, name}, {email} (if unique)
Candidate keys: {emp_id}, {email}
Primary key: emp_id
Alternate key: email UNIQUE

A strong answer is:

Candidate keys are minimal unique identifiers; primary key is the main one; alternate keys stay unique—foreign keys reference parent primary or unique keys.

Composite key vs surrogate key?

Composite key: primary key spans multiple columns—natural for junction tables (order_id, product_id).

Surrogate key: artificial identifier (SERIAL, UUID) with no business meaning.

Approach	Pros	Cons
Natural / composite	Matches business rules	Wide FKs; key changes are painful
Surrogate	Stable, narrow FKs	Extra column; not business-meaningful

Use surrogate when natural keys are long, composite, or unstable (username changes). Keep natural unique constraints where business requires (email UNIQUE).

A strong answer is:

Composite keys fit junction tables; surrogate keys simplify joins when natural keys are wide or unstable—I still enforce business uniqueness with alternate constraints.

Foreign key constraints and referential actions?

A foreign key enforces referential integrity: child values must exist in parent (or be NULL if allowed).

sql


CREATE TABLE orders (
  id         int PRIMARY KEY,
  customer_id int NOT NULL REFERENCES customers(id)
    ON DELETE RESTRICT
    ON UPDATE CASCADE
);

Action	ON DELETE behavior
RESTRICT / NO ACTION	Block delete if children exist
CASCADE	Delete children too
SET NULL	Null out FK in children
SET DEFAULT	Set default value

RESTRICT protects accidental orphan deletes in OLTP. CASCADE suits strict ownership (line items die with order).

A strong answer is:

Foreign keys keep child rows honest—I default to ON DELETE RESTRICT for safety and CASCADE only when child rows have no meaning without the parent.

Types of integrity constraints?

Type	Enforces
Entity integrity	Primary key unique and NOT NULL
Referential integrity	FK values match parent or NULL
Domain integrity	Valid types, CHECK, ENUM
User-defined / business	Custom rules (triggers, CHECK)

sql


CREATE TABLE products (
  id    int PRIMARY KEY,
  price numeric CHECK (price >= 0),
  sku   text UNIQUE NOT NULL
);

ORMs map these to validators—but database constraints are the last line of defense against buggy deploys.

A strong answer is:

Entity, referential, and domain integrity—PKs, foreign keys, and CHECK constraints—I put non-negotiable rules in the database, not only in application code.

NULL in relational databases?

NULL means unknown or not applicable—not zero, not empty string.

Expression	Result
`NULL = NULL`	Unknown (not TRUE)
`NULL AND TRUE`	NULL
`COUNT(*)`	Counts all rows including NULLs
`COUNT(column)`	Ignores NULLs

Use IS NULL / IS NOT NULL. Three-valued logic (TRUE, FALSE, UNKNOWN) affects WHERE filters—rows with UNKNOWN do not match.

Design tip: use NULL for optional fields; avoid NULL in PKs; consider NOT NULL DEFAULT when a sentinel is clearer.

A strong answer is:

NULL is unknown, not zero—comparisons use IS NULL; I avoid NULL primary keys and explain three-valued logic when debugging surprising WHERE results.

Normalization

Why normalize?

Normalization decomposes tables to reduce redundancy and update anomalies.

Anomaly	Problem
Insert	Cannot add data without unrelated data
Update	Must change many rows for one fact
Delete	Removing one fact loses unrelated data

Example denormalized order_lines(customer_name, customer_city, product, qty):

Update customer city in every row they ordered
Delete last order line loses customer address

Normalization splits into customers, products, orders, order_lines.

Trade-off: more joins at read time—sometimes denormalize deliberately for reporting (see Q28).

A strong answer is:

Normalization removes redundancy so updates happen in one place—I explain insert/update/delete anomalies on a messy table, then decompose it.

First normal form (1NF)?

A table is in 1NF when:

Each column holds atomic (indivisible) values
Each row is unique (has a key)
No repeating groups in one column

Violations:

phone_numbers: "555-1, 555-2" → split to customer_phones table
Multiple course1, course2 columns → row per enrollment

1NF is the baseline—almost all RDBMS tables satisfy it by design.

A strong answer is:

1NF means atomic values and no repeating groups—I split multi-valued columns into separate rows or tables instead of comma-separated lists.

Second normal form (2NF)?

A table is in 2NF when:

It is in 1NF
Every non-key attribute depends on the whole primary key—not a proper subset (no partial dependency)

Applies when PK is composite. Example violation:

enrollments(student_id, course_id, student_name, course_title, grade)

student_name depends only on student_id (partial)
course_title depends only on course_id (partial)

Fix: move student_name → students; course_title → courses; keep grade on enrollments.

A strong answer is:

2NF removes partial dependencies on composite keys—I move attributes that depend on only part of the key into their own tables.

Third normal form (3NF)?

A table is in 3NF when:

It is in 2NF
No non-key attribute depends on another non-key attribute (transitive dependency)

Example violation:

employees(emp_id, dept_id, dept_name, dept_location)

dept_name and dept_location depend on dept_id, not directly on emp_id

Fix: departments(dept_id, dept_name, dept_location) + employees(emp_id, dept_id, ...).

Most OLTP schemas target 3NF unless reporting needs dictate otherwise.

A strong answer is:

3NF eliminates transitive dependencies—I pull attributes that depend on non-key columns into separate tables like departments separate from employees.

BCNF vs 3NF?

BCNF (Boyce-Codd Normal Form) strengthens 3NF:

Every determinant (attribute that determines others) must be a candidate key.

Form	Rule
3NF	Non-key attrs depend only on keys; exceptions for overlapping keys
BCNF	Stricter—every functional dependency left side is a super key

3NF but not BCNF example (classic):

student_advisor(student, subject, advisor) where each student has one advisor per subject, and each advisor teaches one subject → decompose further.

In practice 3NF ≈ BCNF for most business schemas. BCNF matters in academic questions and odd dependency patterns.

A strong answer is:

BCNF says every determinant is a candidate key—stricter than 3NF; I know the classic student-advisor example but most production schemas stop at 3NF.

When is denormalization justified?

Denormalization intentionally adds redundancy for read performance or simpler queries.

Use case	Technique
Reporting / OLAP	Star schema, pre-aggregated facts
Read-heavy dashboards	Cached counters (`order_count` on customer)
Reduce join cost	Duplicate display names on fact tables
Document stores	Embed related records (MongoDB)

Costs: update anomalies return—you must sync denormalized copies (triggers, batch jobs, CDC).

Rule: normalize first, measure slow queries, denormalize with a documented consistency strategy.

A strong answer is:

I normalize OLTP to 3NF first, then denormalize only with metrics—star schemas for analytics or cached counts with a clear update path, not upfront duplication.

Functional dependency (FD)?

A functional dependency X → Y means: if two rows agree on X, they must agree on Y.

Notation: student_id → student_name

FD type	Example
Trivial	`(emp_id, name) → emp_id`
Non-trivial	`emp_id → name`
Full	Depends on whole composite key (2NF)
Partial	Depends on subset of composite key (2NF violation)
Transitive	`A → B → C` where B not a key (3NF violation)

Normalization algorithms decompose tables using FDs. Interviewers may ask you to list FDs from a table definition.

A strong answer is:

Functional dependencies describe what determines what—I list them from a schema, then use them to justify 2NF and 3NF decomposition steps.

Scenario: Normalize this order table?

Given (denormalized):

orders(order_id, customer_name, customer_email, product_name, qty, unit_price)

Problems:

Customer data repeats per line → update anomaly on email change
Product name repeats → rename product touches many rows
Cannot list products without an order (insert anomaly)

3NF decomposition:

text


customers(customer_id PK, name, email)
products(product_id PK, name, unit_price)
orders(order_id PK, customer_id FK, order_date)
order_lines(order_id FK, product_id FK, qty, line_total)

line_total may be derived (qty * unit_price) or stored for historical pricing.

A strong answer is:

I split customers, products, orders, and order_lines—removing partial and transitive dependencies so customer email updates happen once in customers.

Transactions, ACID, and concurrency

Explain ACID properties?

ACID guarantees reliable transactions:

Property	Meaning	Example
Atomicity	All or nothing	Transfer debit + credit both commit or both roll back
Consistency	Valid state → valid state	Constraints hold after transaction
Isolation	Concurrent txs appear serial	No dirty reads (at proper level)
Durability	Committed data survives crash	WAL flushed to disk

sql


BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

If second UPDATE fails, ROLLBACK undoes the debit—atomicity. Consistency is partly app + constraint responsibility; engine provides mechanisms.

A strong answer is:

ACID: transfer example for atomicity, constraints for consistency, isolation levels for concurrent reads, WAL for durability—I tie each letter to a concrete failure mode.

Transaction states?

State	Description
Active	Executing statements
Partially committed	Final statement done; commit not yet durable
Committed	Changes permanent
Failed	Normal execution cannot proceed
Aborted	Rolled back; DB unchanged by this tx
Terminated	Left the system

BEGIN starts active state. COMMIT → committed → terminated. Error or ROLLBACK → aborted → terminated.

Savepoints allow partial rollback inside a transaction: SAVEPOINT sp1; ... ROLLBACK TO sp1;

A strong answer is:

Active until COMMIT or ROLLBACK—failed transactions abort; I mention savepoints when a long transaction needs partial undo without full rollback.

Isolation levels and anomalies?

SQL standard isolation levels (lowest to highest):

Level	Dirty read	Non-repeatable read	Phantom read
READ UNCOMMITTED	Possible	Possible	Possible
READ COMMITTED	No	Possible	Possible
REPEATABLE READ	No	No	Possible*
SERIALIZABLE	No	No	No

*PostgreSQL REPEATABLE READ also prevents phantoms via MVCC snapshot; MySQL InnoDB behavior varies—know your engine.

Anomaly	What happens
Dirty read	Read uncommitted data from another tx
Non-repeatable read	Same row read twice, different values
Phantom read	Same query returns different row sets

Default on PostgreSQL is READ COMMITTED—good balance for most OLTP.

A strong answer is:

I compare isolation levels with the dirty/non-repeatable/phantom table—READ COMMITTED for most apps, SERIALIZABLE when financial correctness demands it despite retries.

Concurrency control techniques?

Technique	Idea
Lock-based	Shared (read) vs exclusive (write) locks
Two-phase locking (2PL)	Grow locks then shrink—guarantees serializability
MVCC	Readers see snapshot; writers create new versions (PostgreSQL)
Optimistic	Read freely; validate on commit (version column)
Pessimistic	`SELECT FOR UPDATE` locks early

Lost update without control: two txs read balance 100, both write 90—should be 80.

Fix: row-level lock, atomic UPDATE balance = balance - 10, or optimistic retry.

A strong answer is:

Lock-based 2PL, MVCC snapshots, or optimistic versioning—I pick pessimistic locks for contested balances and MVCC for read-heavy Postgres workloads.

What is deadlock and how is it handled?

Deadlock: cycle of transactions each waiting for a lock held by another.

text

Tx A: lock row 1 → wait row 2
Tx B: lock row 2 → wait row 1

Detection: DBMS builds wait-for graph; cycle → abort victim (youngest tx, least work).

Prevention:

Lock rows in consistent order (always id ascending)
Keep transactions short
Use lower isolation when safe
Retry on deadlock error (40001 in PostgreSQL)

sql


-- App pattern: retry up to N times on deadlock
BEGIN;
SELECT * FROM accounts WHERE id IN (1,2) ORDER BY id FOR UPDATE;
-- transfer logic
COMMIT;

A strong answer is:

Deadlock is a lock cycle—the DBMS kills a victim transaction; I prevent it by locking in consistent order, keeping txs short, and retrying on deadlock errors in app code.

Scenario: Safe money transfer transaction?

Requirements: debit and credit must be atomic; balance never negative; concurrent transfers safe.

sql


BEGIN;

SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;

UPDATE accounts SET balance = balance - 100 WHERE id = 1 AND balance >= 100;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- verify row counts / balances; ROLLBACK if insufficient funds
COMMIT;

Steps to mention:

BEGIN explicit transaction
Lock both rows (FOR UPDATE) in deterministic order
Check balance before debit
COMMIT or ROLLBACK on any failure
Idempotent client token for retries (avoid double transfer)

Isolation: READ COMMITTED + row locks usually enough; SERIALIZABLE for strict audit systems with retry logic.

A strong answer is:

BEGIN, lock both accounts in id order with FOR UPDATE, check balance, debit and credit in one transaction, COMMIT or ROLLBACK—I add idempotency keys for client retries.

Indexing and query processing

Index types in DBMS?

Index	Use case
B-tree (default)	Range queries, sorting, equality
Hash	Equality only; no range
Bitmap	Low-cardinality columns (warehouse)
Full-text	Text search
Composite	Multi-column WHERE / ORDER BY
Covering	INCLUDE columns avoid table lookup
Partial	Subset of rows (active = true)

Clustered (InnoDB PK): data rows stored in index order. Non-clustered: separate structure pointing to rows.

Indexes speed SELECT; slow INSERT/UPDATE/DELETE (maintain index).

A strong answer is:

B-tree for most OLTP equality and range; composite indexes match leftmost WHERE columns—I mention write overhead and avoid indexing every column.

Clustered vs non-clustered index?

	Clustered	Non-clustered
Data layout	Table rows in index order	Index separate from heap
Per table	Usually one (PK)	Many allowed
Lookup	Range scan on PK is fast	Extra hop to heap
Examples	InnoDB PK, SQL Server clustered	PostgreSQL (all indexes non-clustered)

PostgreSQL: table is heap; even PK index is non-clustered—CLUSTER command reorders physically (one-time).

Choose clustered key carefully—it is the physical order (often narrow, monotonic id).

A strong answer is:

Clustered index is the table order—one per table on SQL Server/InnoDB; PostgreSQL uses heap storage with separate indexes—I do not confuse logical PK with physical clustering.

Query processing pipeline?

Stages when you run SELECT:

Parser — syntax check, build parse tree
Validator — tables/columns exist; type check
Query rewriter — views expanded, rule-based transforms
Optimizer — cost-based plan (seq scan vs index scan, join order)
Execution engine — runs plan, returns rows

Statistics (row counts, histograms) feed the optimizer—stale stats → bad plans.

Interview link: PostgreSQL EXPLAIN for engine-specific plan reading.

A strong answer is:

Parse, validate, optimize, execute—I say the optimizer picks plans from statistics, so stale stats cause slow queries even with indexes present.

Scenario: Which index for this query?

Query:

sql


SELECT * FROM orders
WHERE customer_id = 42 AND status = 'open'
ORDER BY created_at DESC
LIMIT 20;

Reasoning:

Option	Verdict
Index on `(customer_id)`	Good filter; may sort large set
Index on `(customer_id, status, created_at DESC)`	Best—filter + order without sort
Index on `(status)` alone	Weak—low selectivity if few statuses

Partial index if most queries want status = 'open':

sql


CREATE INDEX idx_open_customer ON orders (customer_id, created_at DESC)
WHERE status = 'open';

Mention write amplification—only add indexes queries prove they need.

A strong answer is:

Composite index on (customer_id, status, created_at DESC) matches filter and ORDER BY—a partial index on open orders if that is the hot path; I verify with EXPLAIN, not guessing.

SQL vs NoSQL, distributed DB, and final prep

SQL vs NoSQL — when which?

Factor	SQL (RDBMS)	NoSQL
Schema	Fixed, enforced	Flexible, evolving
Relationships	Joins, FKs	Embed or application joins
Transactions	Strong ACID (default)	Varies (eventual common)
Scale-out	Vertical + read replicas; sharding harder	Partitioning built-in
Query	Ad hoc SQL	Model-specific APIs

Choose SQL when: complex relationships, strong consistency, reporting joins, financial correctness.

Choose NoSQL when: flexible schema, massive horizontal scale, simple access patterns (key lookup), document shape matches app (see MongoDB interview questions).

Many systems use both—Postgres for source of truth, Redis for cache, Elasticsearch for search.

A strong answer is:

SQL for relational integrity and ad hoc joins; NoSQL for flexible schema and horizontal scale—I pick by access pattern and consistency needs, not religion.

CAP theorem — practical meaning?

CAP: in a network partition, you choose between Consistency and Availability (Partition tolerance is mandatory in distributed systems).

Choice	Behavior during partition
CP	Refuse writes/reads to stay consistent (some nodes unavailable)
AP	Accept requests; replicas may diverge (eventual consistency)

Not "pick two of three" in normal operation—only during partition.

Examples:

Traditional RDBMS primary: often CP-ish for writes
Cassandra: tunable; AP leaning
Redis Cluster: configuration-dependent

PACELC extension: else (no partition), choose Latency vs Consistency.

A strong answer is:

During a partition you trade consistency for availability or vice versa—I explain CP vs AP with a concrete system, not as a buzzword triangle.

Replication and backup types?

Replication — copies data for read scale or HA:

Type	Mechanism	Use
Primary-replica	Async/sync copy	Read scaling, failover
Multi-primary	Bidirectional (conflict risk)	Geo write
Logical	Row/change events	CDC, partial sync

Backup:

Type	Contents	Recovery
Full	Entire database	Baseline restore
Incremental	Changes since last backup	Faster backup; chain restore
PITR	WAL/archive logs	Restore to timestamp

RPO (how much data loss) and RTO (downtime) drive strategy. Test restore drills—backups untested are wishes.

A strong answer is:

Primary-replica for HA and read scale; full plus WAL for PITR—I define RPO/RTO and insist on tested restore, not just backup jobs running.

Views, stored procedures, and triggers?

Object	Purpose	Caveat
View	Saved query; virtual table	Updatable views have restrictions
Stored procedure	Logic in DB	Vendor-specific; harder to test in CI
Trigger	Auto-run on INSERT/UPDATE/DELETE	Hidden side effects; debug pain

Views simplify security (GRANT on view not base table) and encapsulate joins.

Triggers for audit logs work; avoid heavy business logic that belongs in application services.

Materialized view stores results—refresh on schedule for dashboards.

A strong answer is:

Views encapsulate queries and permissions; procedures bundle DB-side logic; triggers for audit—I avoid trigger spaghetti that duplicates app business rules.

Final DBMS interview checklist?

DBMS vs RDBMS crisp definition
Draw ER diagram with cardinality
1NF → 3NF with anomalies explained
Keys: PK, FK, candidate, surrogate
ACID with transfer example
Isolation levels anomaly table
Deadlock prevention + retry
B-tree index rationale
SQL vs NoSQL trade-offs
CAP during partition
One normalization scenario on whiteboard
SQL technical interviews for live coding
PostgreSQL if engine-specific loop
MongoDB for document contrast

A strong answer is:

I rehearse ER + normalization weekly, explain ACID on a transfer, and pair theory here with SQL practice from the SQL guide—not campus notes without queries.

Pattern cheat sheet (quick reference)

Task	DBMS approach
Unique row identity	Primary key; surrogate if natural key unstable
Parent-child integrity	Foreign key + deliberate ON DELETE action
Remove redundancy	Normalize to 3NF; denormalize only with sync plan
Money movement	Transaction + row locks + balance check
Hot filtered query	Composite or partial index matching WHERE/ORDER BY
Concurrent reads	MVCC (Postgres) or isolation level choice
Deadlock	Lock ordering + short txs + app retry
Flexible schema	Document store or JSONB column (engine-dependent)
Analytics	Star schema / materialized views
HA	Replication + tested backup restore

References

DBMS interview prep

On-site prep

Summary

DBMS interviews test ER modeling, normal forms, ACID, and indexing trade-offs with concrete examples—not definition memorization alone. Normalize a messy table on a whiteboard, walk through a safe transfer transaction, and defend SQL vs NoSQL for a real product. Pair theory here with SQL technical interviews, PostgreSQL interview questions, and MongoDB interview questions when the loop goes deeper.

Interview context and how to prepare

Database fundamentals

ER model and relational mapping

Keys, constraints, and integrity

Normalization

Transactions, ACID, and concurrency

Indexing and query processing

SQL vs NoSQL, distributed DB, and final prep

Pattern cheat sheet (quick reference)

References

Summary

Related Articles

PostgreSQL Interview Questions and Answers

SQL Technical Interview Questions and Answers

How to Install Redis on Ubuntu

Search GoLinuxCloud