DBA – Solution to Exercise 4 on Select with Group By

I hope that you played the game and try 1 or more SQL queries to answer the request from the Database Administrator provided activity.

Once again we try on an approach easy to understand with a real world use case. Do not hesitate to post a comment to discuss about it.

Context reminder

Using your favorite SQL server (ex : PostgreSQL), you should have the following rows :

Student sql rows
Students
discipline sql rows
Disciplines
Score sql rows
Score rows +64

Solution

Well this one should be handled step by step except if you have a good understanding of SQL !

Take the request and divide it so we can deal with it later :

  1. Pick up all students having at least 3 marks above 8 :
SELECT student.name as student, student.id, score 
FROM score
INNER JOIN student ON score.student_id = student.id 
WHERE score > 8  
GROUP BY student.id, student.name, score 
HAVING count(score) > 2 ORDER BY student.name;

Here you are :

Score & Student - group by Having

There is more data using scores above 8. You may check the results on your own.

Now the point is that if we need to show the relative disciplines names, we will end up with 3 or more lines for each student (actually only Megane is concerned :)).

Megane has got 3 disciplines on which she obtained a score of 10.

How do we list them along with the rest ?

Well that is not so obvious using the last query; think about it :

Do you think you can have 1 discipline having more than 2 scores above 9 for the same student ?

Basically the HAVING clause is filtering on group elements and if the last doesn’t match, then no result is returned.

More details about the Group By and Having clauses are available on DataCamp.

Following is what I propose as a solution, you may use the former query as a SUB QUERY (you could also use a WITH Query however it is not necessary for such simple use case) :

SELECT discipline.name, student_name, scoring 
FROM discipline 
INNER JOIN score ON score.discipline_id = discipline.id
INNER JOIN (SELECT st.name as student_name, s.score as scoring , st.id as student_id
	FROM score s
	INNER JOIN student st ON s.student_id = st.id 
	WHERE s.score > 9  
	GROUP BY st.id, st.name, scoring 
	HAVING count(s.score) > 2 ORDER BY st.name) as sub_score ON sub_score.scoring = score.score AND sub_score.student_id = score.student_id;

Now using the WITH AS query (just for fun !) :

WITH sub_score AS (SELECT st.name as student_name, s.score as scoring , st.id as student_id
	FROM score s
	INNER JOIN student st ON s.student_id = st.id 
	WHERE s.score > 9  
	GROUP BY st.id, st.name, scoring 
	HAVING count(s.score) > 2 ORDER BY st.name)
SELECT discipline.name, student_name, scoring 
FROM discipline 
INNER JOIN score ON score.discipline_id = discipline.id
INNER JOIN sub_score ON sub_score.scoring = score.score AND sub_score.student_id = score.student_id;
Inner join with group by query

Both queries have the same query plan so pick up your favorite syntax !

Query optimization

As usual, let’s see if we can improve the overall performance.

There are 2 spots which seems to be interesting :

  1. the score is used for aggregation
  2. The student name is used for ordering
CREATE INDEX idx_score 
ON score(score);

Using the above index, we notice that Postgres is using another path to process the request, there is a slight improvement.

CREATE INDEX idx_name
ON student(name);

Now we are talking ! The cost of the query has been reduced significantly, you may now play with billions of students ;).

If you take a look at the query plan of each configuration, you will notice the change about the cost of the Sort process, it went from ~70 to approximately ~3.55.

Query plan - Group by query - Postgres

That’s it for today, I’ll be welcoming any constructive comment to help us improve our lines.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Want more information?

Related links will be displayed within articles for you to pick up another good spot to get more details about software development, deployment & monitoring.

Stay tuned by following us on Youtube.

%d bloggers like this: