SQL Interview Questions: The 30 You'll Actually Get Asked
Last updated: March 2026
I have sat on both sides of SQL interviews — as a candidate who once blanked on the difference between INNER JOIN and LEFT JOIN (embarrassing, but true), and as someone who reviews technical assessments for data roles. Here is what I have learned: most SQL interview prep guides are useless. They cover 200+ questions, half of which you will never see in a real interview, and they skip the practical scenarios that actually stump candidates. This guide is different. These are the 30 SQL questions I have seen come up repeatedly across interviews at banks, tech companies, startups, and consulting firms — both in Azerbaijan and internationally. Each question includes the concept being tested, the expected answer, and the follow-up question the interviewer will likely ask.
How SQL Interviews Actually Work
Before diving into questions, let me set expectations. SQL interviews come in three formats:
| Format | Duration | Used By | What They Test |
|---|---|---|---|
| Take-home SQL test | 1-3 hours | Startups, mid-size companies | Real-world query writing, data cleaning |
| Live coding (shared screen) | 30-45 min | FAANG, large tech companies | Problem-solving speed, thought process |
| Whiteboard/verbal | 20-30 min | Banks, consulting firms | Conceptual understanding, schema design |
For all formats, the interviewer is looking for three things: (1) Can you write correct SQL? (2) Can you optimize it? (3) Can you explain your reasoning? Let us get into the questions.
Section 1: Fundamentals (Questions 1-10)
Question 1: What is the difference between WHERE and HAVING?
Concept: Filtering before vs. after aggregation.
Answer: WHERE filters individual rows before grouping. HAVING filters groups after GROUP BY and aggregation. You cannot use aggregate functions in WHERE — you must use HAVING.
Follow-up: "Can you use WHERE and HAVING in the same query?" Yes. WHERE filters rows first, then GROUP BY groups the remaining rows, then HAVING filters groups.
Question 2: Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN
Concept: Join types and their behavior with non-matching rows.
Answer:
- INNER JOIN: Returns only matching rows from both tables.
- LEFT JOIN: Returns all rows from the left table + matching rows from the right. Non-matching right rows are NULL.
- RIGHT JOIN: Returns all rows from the right table + matching rows from the left.
- FULL OUTER JOIN: Returns all rows from both tables. Non-matching rows are filled with NULL.
Follow-up: "When would you use a LEFT JOIN instead of an INNER JOIN?" When you need all records from the primary table even if there is no match — e.g., all customers including those with zero orders.
Question 3: What is the difference between UNION and UNION ALL?
Concept: Set operations and deduplication.
Answer: UNION combines results and removes duplicates (like DISTINCT). UNION ALL combines results and keeps all rows, including duplicates. UNION ALL is faster because it skips the deduplication step.
Follow-up: "When should you always use UNION ALL?" When you know there are no duplicates, or when you intentionally want to keep duplicates.
Question 4: What does GROUP BY do, and what happens if you include a non-aggregated column?
Concept: Aggregation rules.
Answer: GROUP BY groups rows with the same values in specified columns, allowing aggregate functions (COUNT, SUM, AVG, etc.) to operate on each group. In standard SQL, every column in SELECT must either be in GROUP BY or wrapped in an aggregate function. MySQL has a loose mode that allows non-aggregated columns (picking an arbitrary value), but this is considered bad practice.
Follow-up: "What is the order of SQL execution?" FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY -> LIMIT.
Question 5: Write a query to find the second highest salary
Concept: Subqueries, LIMIT/OFFSET, or window functions.
Answer (Method 1 — Subquery):
SELECT MAX(salary) FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Answer (Method 2 — Window function):
SELECT salary FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) as rnk
FROM employees
) ranked
WHERE rnk = 2;
Follow-up: "What is the difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?" RANK skips numbers after ties (1,1,3), DENSE_RANK does not skip (1,1,2), ROW_NUMBER gives unique numbers (1,2,3).
Question 6: What is a self-join? Give an example
Concept: Joining a table with itself.
Answer: A self-join joins a table to itself using aliases. Common use case: finding employees and their managers when both are in the same table.
SELECT e.name AS employee, m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Question 7: What is the difference between DELETE, TRUNCATE, and DROP?
Concept: Data removal methods.
Answer:
- DELETE: Removes specific rows (can use WHERE). Logged, can be rolled back. Slow for large tables.
- TRUNCATE: Removes all rows. Minimal logging, faster. Cannot use WHERE. Resets auto-increment.
- DROP: Removes the entire table (structure + data).
Question 8: What are indexes? When should you use them?
Concept: Query performance optimization.
Answer: An index is a data structure (usually B-tree) that speeds up data retrieval. Use indexes on columns frequently used in WHERE, JOIN, ORDER BY, and GROUP BY clauses. Trade-off: indexes speed up reads but slow down writes (INSERT, UPDATE, DELETE) because the index must be updated.
Follow-up: "What is a composite index?" An index on multiple columns. Column order matters — the index is used left-to-right (leftmost prefix rule).
Question 9: What is normalization? Explain 1NF, 2NF, and 3NF
Concept: Database design principles.
- 1NF: Each column contains atomic (indivisible) values. No repeating groups.
- 2NF: 1NF + no partial dependency (all non-key columns depend on the entire primary key).
- 3NF: 2NF + no transitive dependency (non-key columns do not depend on other non-key columns).
Follow-up: "When would you denormalize?" When read performance is critical and write frequency is low — e.g., analytics tables, data warehouses.
Question 10: What is the difference between a subquery and a CTE?
Concept: Query organization.
Answer: A CTE (Common Table Expression, using WITH) is a named temporary result set that exists only for the duration of the query. It improves readability over nested subqueries and can be referenced multiple times. A subquery is a query nested inside another query. CTEs are generally preferred for complex queries because they are easier to read and debug.
Follow-up: "Can a CTE be recursive?" Yes. Recursive CTEs can reference themselves and are useful for hierarchical data (org charts, tree structures).
Section 2: Intermediate (Questions 11-20)
Question 11: Write a query to find duplicate rows
Concept: GROUP BY + HAVING for deduplication detection.
SELECT email, COUNT(*) as cnt
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
Follow-up: "How would you delete the duplicates, keeping only one?" Use ROW_NUMBER() to identify duplicates, then delete those with row_number > 1.
Question 12: What is COALESCE and when do you use it?
Concept: NULL handling.
Answer: COALESCE returns the first non-NULL value from a list of arguments. Use it to provide default values for NULL columns.
SELECT COALESCE(phone, email, 'No contact') AS contact_info
FROM customers;
Question 13: Explain window functions with PARTITION BY
Concept: Analytics across row groups without collapsing them.
Answer: Window functions perform calculations across a set of rows related to the current row. Unlike GROUP BY, they do not collapse rows.
SELECT
department,
employee_name,
salary,
AVG(salary) OVER (PARTITION BY department) as dept_avg_salary,
salary - AVG(salary) OVER (PARTITION BY department) as diff_from_avg
FROM employees;
This query shows each employee's salary alongside the department average — something impossible with just GROUP BY.
Question 14: Write a query to calculate running total
Concept: Window functions with ORDER BY.
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;
Follow-up: "How would you calculate a 7-day moving average?" Use ROWS BETWEEN 6 PRECEDING AND CURRENT ROW.
Question 15: What is the difference between EXISTS and IN?
Concept: Subquery performance.
Answer: IN checks if a value matches any value in a list/subquery. EXISTS checks if a subquery returns any rows. EXISTS is typically faster for large subqueries because it stops at the first match. IN can be faster for small lists. EXISTS handles NULLs better — IN with NULL in the list can produce unexpected results.
Question 16: Write a query to find employees who earn more than their department average
Concept: Correlated subqueries or window functions.
SELECT employee_name, salary, department
FROM employees e
WHERE salary > (
SELECT AVG(salary) FROM employees
WHERE department = e.department
);
Or with a window function:
SELECT * FROM (
SELECT
employee_name, salary, department,
AVG(salary) OVER (PARTITION BY department) as dept_avg
FROM employees
) sub
WHERE salary > dept_avg;
Question 17: What is a CROSS JOIN?
Concept: Cartesian product.
Answer: CROSS JOIN returns the Cartesian product — every row from the first table combined with every row from the second table. If table A has 100 rows and table B has 50 rows, the result has 5,000 rows. Use case: generating all possible combinations (e.g., all product-color combinations).
Question 18: How do you handle NULL values in aggregations?
Concept: NULL behavior in SQL.
Answer: Most aggregate functions (SUM, AVG, COUNT) ignore NULLs. COUNT(*) counts all rows including NULLs, but COUNT(column_name) only counts non-NULL values. NULL comparisons with = or != always return NULL (use IS NULL / IS NOT NULL instead).
Question 19: Write a query to pivot rows into columns
Concept: Data transformation / pivoting.
SELECT
student_name,
MAX(CASE WHEN subject = 'Math' THEN score END) AS math,
MAX(CASE WHEN subject = 'English' THEN score END) AS english,
MAX(CASE WHEN subject = 'Science' THEN score END) AS science
FROM scores
GROUP BY student_name;
Follow-up: "What if you don't know the column names in advance?" You would need dynamic SQL or handle it in the application layer.
Question 20: What is a correlated subquery? How is it different from a regular subquery?
Concept: Subquery execution model.
Answer: A correlated subquery references columns from the outer query and is executed once for each row of the outer query. A regular subquery is executed once, independently. Correlated subqueries can be slower (O(n*m) vs O(n+m)) but are sometimes the clearest way to express a query.
Section 3: Advanced (Questions 21-30)
Question 21: Explain EXPLAIN / EXPLAIN ANALYZE
Concept: Query performance analysis.
Answer: EXPLAIN shows the query execution plan — how the database plans to execute your query (which indexes it uses, join order, estimated costs). EXPLAIN ANALYZE actually runs the query and shows real execution times. Use EXPLAIN to identify full table scans, missing indexes, and inefficient joins.
Follow-up: "What is a full table scan and why is it bad?" It reads every row in the table. It is slow for large tables when you only need a few rows. Adding an appropriate index turns it into an index scan.
Question 22: What is a deadlock? How do you prevent it?
Concept: Concurrency and locking.
Answer: A deadlock occurs when two transactions are waiting for each other to release locks, creating a circular dependency. Prevention strategies: always acquire locks in the same order, keep transactions short, use row-level locking instead of table-level, and set appropriate lock timeouts.
Question 23: Write a query to find gaps in a sequence
Concept: Self-joins or window functions for gap analysis.
SELECT
id + 1 as gap_start,
next_id - 1 as gap_end
FROM (
SELECT id, LEAD(id) OVER (ORDER BY id) as next_id
FROM sequence_table
) sub
WHERE next_id - id > 1;
Question 24: What is the difference between OLTP and OLAP?
Concept: Database workload types.
| Aspect | OLTP | OLAP |
|---|---|---|
| Purpose | Day-to-day transactions | Analytics, reporting |
| Queries | Simple, frequent, short | Complex, infrequent, long |
| Data model | Normalized (3NF) | Denormalized (star/snowflake schema) |
| Examples | PostgreSQL, MySQL | BigQuery, Redshift, Snowflake |
Question 25: Write a query to calculate month-over-month growth rate
Concept: LAG window function for time series analysis.
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) as prev_month_revenue,
ROUND(
(revenue - LAG(revenue) OVER (ORDER BY month))
/ LAG(revenue) OVER (ORDER BY month) * 100, 2
) as growth_pct
FROM monthly_revenue;
Question 26: What are materialized views?
Concept: Pre-computed query results.
Answer: A materialized view stores the result of a query physically, unlike a regular view which is just a saved query that runs on access. Materialized views are faster to query but can become stale. They need to be refreshed (manually or on a schedule). Use them for expensive aggregation queries that do not need real-time data.
Question 27: Write a query to find the top 3 products per category by revenue
Concept: Top-N per group (very common in real interviews).
WITH ranked AS (
SELECT
category,
product_name,
SUM(revenue) as total_revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY SUM(revenue) DESC) as rn
FROM sales
GROUP BY category, product_name
)
SELECT category, product_name, total_revenue
FROM ranked
WHERE rn <= 3;
Question 28: What is a transaction? Explain ACID properties
Concept: Data integrity.
- Atomicity: All operations in a transaction succeed, or all fail (no partial state).
- Consistency: Transaction brings the database from one valid state to another.
- Isolation: Concurrent transactions do not interfere with each other.
- Durability: Once committed, data survives crashes.
Question 29: How do you optimize a slow query?
Concept: Performance tuning methodology.
Answer (Step by step):
- Run EXPLAIN ANALYZE to see the execution plan.
- Look for full table scans — add indexes on filtered/joined columns.
- Check for missing WHERE clauses or overly broad conditions.
- Replace correlated subqueries with JOINs or CTEs.
- Use appropriate data types (do not store dates as strings).
- Consider denormalization or materialized views for analytics queries.
- Check for N+1 query patterns in the application layer.
- Monitor index usage — drop unused indexes (they slow down writes).
Question 30: Design a schema for a job board (like BirJob)
Concept: Schema design, real-world modeling.
Answer:
-- Core tables
CREATE TABLE companies (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
website VARCHAR(500),
logo_url VARCHAR(500),
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE jobs (
id SERIAL PRIMARY KEY,
company_id INTEGER REFERENCES companies(id),
title VARCHAR(255) NOT NULL,
description TEXT,
location VARCHAR(255),
salary_min DECIMAL,
salary_max DECIMAL,
job_type VARCHAR(50), -- full-time, part-time, contract
apply_url VARCHAR(500) UNIQUE,
is_active BOOLEAN DEFAULT true,
scraped_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE job_categories (
job_id INTEGER REFERENCES jobs(id),
category_id INTEGER REFERENCES categories(id),
PRIMARY KEY (job_id, category_id)
);
-- Indexes for common queries
CREATE INDEX idx_jobs_company ON jobs(company_id);
CREATE INDEX idx_jobs_active ON jobs(is_active) WHERE is_active = true;
CREATE INDEX idx_jobs_created ON jobs(created_at DESC);
Follow-up: "How would you handle searching across job titles and descriptions?" Full-text search index (tsvector in PostgreSQL) or integration with Elasticsearch.
SQL Interview Preparation Strategy
Week 1: Fundamentals (Questions 1-10)
- Practice basic SELECT, WHERE, JOIN, GROUP BY queries
- Use a practice database (PostgreSQL + pgAdmin, or SQLite online)
- Complete 20-30 easy problems on LeetCode SQL or HackerRank SQL
Week 2: Intermediate (Questions 11-20)
- Focus on window functions — these are the most common differentiator
- Practice with real datasets (Kaggle datasets loaded into a database)
- Complete 20-30 medium problems
Week 3: Advanced (Questions 21-30)
- Study query optimization and EXPLAIN output
- Practice schema design problems
- Do timed practice — simulate real interview conditions
Week 4: Mock Interviews
- Practice with a friend or on Pramp
- Explain your reasoning out loud while writing queries
- Review company-specific SQL questions (if available)
For a broader look at data analyst career preparation, check out our Data Analyst Roadmap for 2026.
What I Actually Think
SQL is the most underrated skill in tech. Every developer, data analyst, product manager, and even marketer benefits from knowing SQL well. Yet most people learn just enough to get by — SELECT * FROM table WHERE id = 1 — and never go deeper.
The gap between "I know SQL" and "I am good at SQL" is where interview offers live. Window functions, CTEs, query optimization, and schema design — these are the skills that separate a junior analyst from a senior one. And they are not hard to learn. Four weeks of focused practice is enough to go from beginner to interview-ready.
My advice: do not just memorize queries. Understand why each approach works. When you understand that a window function lets you calculate aggregates without losing row-level detail, you can apply that concept to any problem — you do not need to have seen the exact question before.
Also, practice on real data. Download a Kaggle dataset (e-commerce transactions, Spotify data, whatever interests you), load it into PostgreSQL, and write queries to answer real questions. "What are the top 5 products by revenue per month?" "Which customers have not ordered in 90 days?" This kind of practice is 10x more valuable than abstract LeetCode problems.
Action Plan
- Today: Set up a local PostgreSQL database (or use an online SQL playground like db-fiddle.com).
- This week: Review Questions 1-10 from this guide. Practice each one until you can write the query without looking.
- Week 2: Master window functions (RANK, ROW_NUMBER, LAG, LEAD, SUM OVER). These appear in 60%+ of SQL interviews.
- Week 3: Practice schema design. Draw ER diagrams for systems you use daily (e-commerce, social media, booking system).
- Week 4: Do 3-5 mock SQL interviews. Time yourself. Explain your reasoning out loud.
Sources
- PostgreSQL Documentation — postgresql.org/docs
- LeetCode Database Problems — leetcode.com/problemset/database
- HackerRank SQL Challenges — hackerrank.com/domains/sql
- "SQL Performance Explained" by Markus Winand — use-the-index-luke.com
- BirJob technical interview data — birjob.com
I'm Ismat, and I build BirJob — Azerbaijan's job aggregator scraping 80+ sources daily.
