Mastering SQL Interviews: Top 5 Commonly Asked Problems and How to Solve Them

5.28K 0 0 0 0

Chapter 5: Aggregating Data with GROUP BY and HAVING

🧠 Introduction

SQL’s power lies in its ability to aggregate and summarize data. The GROUP BY clause is the go-to tool for performing aggregate operations like totals, averages, counts, and more — especially when grouped by specific attributes such as department, date, region, or customer.

Interviewers often ask:

“Find departments with more than 2 employees.”
“List customers who made more than $10,000 in purchases.”
“Find products with an average rating above 4.”

All these rely on GROUP BY and its companion filter: HAVING.

In this chapter, you’ll learn:

  • How GROUP BY works
  • When and why to use HAVING vs WHERE
  • How to use multiple aggregate functions
  • Real-world use cases and common interview challenges

🗃️ Sample Table: Orders

🧾 orders

order_id

customer_id

product

quantity

amount

order_date

1

101

Mouse

2

500

2024-01-01

2

102

Keyboard

1

700

2024-01-02

3

101

Laptop

1

30000

2024-01-02

4

103

Monitor

1

15000

2024-01-03

5

101

Mouse

1

250

2024-01-04

6

104

Keyboard

3

2100

2024-01-04

7

102

Mouse

2

500

2024-01-05


🔍 Problem 1: Total Amount Spent by Each Customer

SELECT customer_id, SUM(amount) AS total_spent

FROM orders

GROUP BY customer_id;

Output:

customer_id

total_spent

101

30750

102

1200

103

15000

104

2100

We’re summarizing (SUM) by grouping on customer_id.


Problem 2: Customers Who Spent More Than 10,000

SELECT customer_id, SUM(amount) AS total_spent

FROM orders

GROUP BY customer_id

HAVING SUM(amount) > 10000;

Output:

customer_id

total_spent

101

30750

103

15000

Use HAVING to filter grouped results.
🔥 HAVING works after aggregation; WHERE works before.


💡 WHERE vs HAVING

Clause

Used For

Applied When

WHERE

Filtering rows

Before aggregation

HAVING

Filtering groups

After aggregation

Example:

-- Only orders > 1000

SELECT customer_id, SUM(amount)

FROM orders

WHERE amount > 1000

GROUP BY customer_id;


🔄 Multiple Aggregates Example

Show total quantity and amount for each product.

SELECT product,

       SUM(quantity) AS total_qty,

       SUM(amount) AS total_sales

FROM orders

GROUP BY product;


️ Using GROUP BY with Multiple Columns

Show total sales per customer per product.

SELECT customer_id, product, SUM(amount) AS total_sales

FROM orders

GROUP BY customer_id, product;


🎯 Real-World Use Cases

  • Revenue by customer, region, or time period
  • Counting number of logins per user
  • Average session duration per device
  • Product popularity by category

🧑💻 Interview Challenge

Find all products that have been ordered more than twice.

SELECT product, COUNT(*) AS order_count

FROM orders

GROUP BY product

HAVING COUNT(*) > 2;

Output:

product

order_count

Mouse

3


📊 Combining with ORDER BY and LIMIT

Top 2 customers by total spending

SELECT customer_id, SUM(amount) AS total_spent

FROM orders

GROUP BY customer_id

ORDER BY total_spent DESC

LIMIT 2;


🧠 Common Mistakes

Mistake

Why it’s wrong

Using WHERE with aggregate functions

Aggregates can’t be used in WHERE

Forgetting GROUP BY when using SUM, AVG, etc.

SQL will throw an error

Mixing aggregated and non-aggregated columns

Only columns in GROUP BY or aggregates can be in SELECT


Performance Tips

  • Index the columns used in GROUP BY to speed up queries
  • Avoid grouping on unnecessary columns — the more you group, the more work SQL does
  • Use LIMIT or TOP when fetching ranked/grouped results

Summary

  • GROUP BY is used to summarize data by one or more columns.
  • HAVING filters results after aggregation — unlike WHERE.
  • You can use multiple aggregation functions in a single query.
  • Real-world applications include dashboards, KPIs, reports, and fraud detection.

Back

FAQs


1. What are the most commonly asked SQL questions in interviews?

Interviewers often ask about joins, groupings, window functions, subqueries, and ranking problems. These test both your understanding of SQL syntax and logic.

2. How should I prepare for SQL interviews?

Focus on solving real-world problems, practice on platforms like LeetCode or HackerRank, and understand how SQL is used in business scenarios.

3. Do I need to memorize queries or understand logic?

Understanding logic is more important. Interviewers prefer candidates who can reason through a problem over those who rely on memorized solutions.

4. Are window functions important for SQL interviews?

Yes. Functions like ROW_NUMBER(), RANK(), and DENSE_RANK() often appear in intermediate to advanced interviews.

5. What’s the best way to practice SQL?

Use real datasets, explore open SQL challenges online, and build your own problems from business scenarios.

6. Can I use aliases and CTEs in interview solutions?

Absolutely. Using aliases and CTEs can make your queries cleaner and more readable, which is often appreciated in interviews.

7. How do I handle NULL values in SQL queries?

Learn to use IS NULL, COALESCE(), and proper filtering logic to handle NULLs, as they can affect groupings and joins.

8. Are joins and subqueries tested frequently?

Yes. Inner joins, left joins, and subqueries are very common in interviews, especially when querying across multiple tables.

9. Is it okay to discuss multiple approaches in an interview?

Yes. Explaining both brute-force and optimized approaches shows depth in understanding and problem-solving.

10. How long does it take to get good at SQL for interviews?

With consistent practice, most people can get proficient in 2–4 weeks, especially if they focus on real interview problems.

Tutorials are for educational purposes only, with no guarantees of comprehensiveness or error-free content; TuteeHUB disclaims liability for outcomes from reliance on the materials, recommending verification with official sources for critical applications.