SQL GROUP BY Statement Explained with Practical Examples

SQL GROUP BY Statement

Data Grouping and Data Aggregation are the important concepts of SQL. There are circumstances where we need to apply aggregate function not only to a single set of rows, but also to a group of rows that have the same values, so in such a situation, the SQL GROUP BY statement is used with SQL SELECT statement.

The GROUP BY clause is part of the SQL SELECT statement. Optionally it is used in conjunction with aggregate functions to produce the resulting group of rows from the database.

Advertisement

SQL GROUP BY clause is placed after the WHERE clause and before the ORDER BY clause in SQL select statement.

 

SQL GROUP BY Statement Syntax

GROUP BY ( 
   column-Name [ , column-Name ]*  | 
   ROLLUP ( column-Name [ , column-Name ]* | 
   CUBE (column-Name [, column-Name ]* |  
   GROUPING SETS (column-Name [, column-Name ]* 
)

column-Name: Specifies a column or a non-aggregate calculation on a column. This column can belong to a table, derived table, or view. The column must appear in the FROM clause of the SELECT statement but is not required to appear in the SELECT list.

GROUP BY (column-Name [, column-Name] *): Groups the SELECT statement results according to the values in a list of one or more column expressions.  

 

GROUP BY ROLLUP ()

  • ROLLUP is a simple extension to the SQL Group by clause, ROLLUP enables a SQL SELECT statement to calculate multiple levels of subtotals across a specified group of dimensions.
  • In addition, GROUP BY ROLLUP to aggregate data and improve the data analytical capabilities in SQL Server.

 

GROUP BY CUBE ()

  • GROUP BY CUBE is also an extension of the GROUP BY clause similar to GROUP BY ROLLUP. GROUP BY CUBE creates groups for all possible combinations of columns.
  • CUBE is especially valuable in queries that use columns from multiple dimensions rather than columns representing different levels of a single dimension.
  • In addition to producing all the rows of a GROUP BY ROLLUP, GROUP BY CUBE adds all the “cross-tabulations” rows. Sub-total rows are rows that further aggregate whose values are derived by computing the same aggregate functions that were used to produce the grouped rows.

 

GROUP BY GROUPING SETS ()

  • GROUP BY GROUPING SETS is a powerful extension of the GROUP BY clause that allows computing multiple group-by clauses in a single statement. It is either used with ROLLUP or CUBE operator.
  • The GROUPING SETS option gives you the ability to combine multiple GROUP BY clauses into one GROUP BY clause. The results are the equivalent of UNION ALL of the specified groups.
  • For example, GROUP BY ROLLUP (Country, Region) and GROUP BY GROUPING SETS (ROLLUP (Country, Region)) return the same results.
  • Using a single column as its argument, GROUPING returns 1 when it encounters a NULL value created by a ROLLUP or CUBE operation. That is, if the NULL indicates the row is a subtotal, GROUPING returns a 1. Any other type of value, including a stored NULL, will return a 0.

 

SQL GROUP BY Statement with SELECT Clause Syntax

SELECT clause with WHERE Condition:

SELECT [column_name | aggregate_function_name(column_name),... ]   
FROM [table_name] 
[WHERE conditions] 
GROUP BY column_name1[, column_name2...] 
[ORDER BY expression [ ASC | DESC]];

SELECT clause with HAVING Condition:

SELECT [column_name | aggregate_function_name(column_name),... ] 
GROUP BY column_name1[, column_name2...]  
[HAVING condition];

 

SQL GROUP BY Statement | Aggregate Functions

Aggregate functions are functions that take a set of rows as input and return a single value. In SQL we have five aggregate functions which are also called multirow functions as follows.

Advertisement
  • SUM (): Returns the sum or total of each group.
  • COUNT (): Returns the number of rows of each group.
  • AVG (): Returns the average and mean of each group.
  • MIN (): Returns the minimum value of each group.
  • MAX (): Returns the minimum value of each group.

For examples of SQL GROUP BY Statement, here we are considering some tables of school management systems:

SQL GROUP BY Statement Explained with Practical Examples

The Records of above tables are as follow:

Student Table: 

student_idstudentnameadmissionnoadmissiondateenrollmentnodate_of_birthemailcityclass_id
101reema1000102-02-2000e1520000202-02-1990reema@gmail.comsurat2
102kriya1000204-05-2001e1620000304-08-1991kriya@gmail.comsurat1
103meena1000306-05-1999e1520000402-09-1989meena@gmail.comvadodara3
104carlin200104-01-1998e1420000104-04-1989carli@gmail.comvapi1
105dhiren200202-02-1997e1340000202-02-1987dhiru@gmail.comvapi2
106hiren200301-01-1997e1340000103-03-1887hiren@gmail.comsurat2
107mahir1000406-09-2000e1520000307-09-1990mahi@gmail.comvapi3
108nishi200402-04-2001e1620000103-02-1991nishi@gmail.comvadodara1

Result Table:

result_idstudent_idexamnameexamdatesubjectobtainmarktotalmarkspercentagegradestatus
3001101sem107-08-200118010080A+pass
3002101sem108-08-200127610076A+pass
3003102sem305-05-200036710067Apass
3004102sem306-05-200048910089A+pass
3005102sem307-05-200059010090A+pass
3006103sem508-09-199865510055Bpass
3007103sem509-09-199873010030Dfail
3008103sem510-09-199883410034Dfail

Subject Table:

subjectidfacultynamesubjectnamesubjectcode
1krishnac1003
2rahulcpp1004
3radhaasp1005
4meerasql1006
5yasodacloud1007
6nadancg1008

 

SQL GROUP BY Statement | Single column Example

Count the number of students in each class using SQL GROUP BY Statement 

SELECT class_id, COUNT (*) AS 'Total student' 
FROM  tblstudent 
GROUP BY class_id;
  • To get the result of student count class-wise, we applied the SQL GROUP BY clause on class_id
  • The COUNT function is used to make counting, count (*) counts the number of rows produced by the query.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY WITH JOIN Example | Multiple columns

Use SQL GROUP BY Statement with join query to get detail of students who obtained maximum marks in his/her semester 

Advertisement
SELECT tblstudent.studentname, MAX(tblresult.obtainmark) AS 'Maximum Mark', tblresult.examname 
FROM  tblstudent INNER JOIN 
tblresult ON tblstudent.student_id = tblresult.student_id 
GROUP BY tblresult.examname, tblstudent.studentname;
  • In the above query, we have to use the MAX function to find maximum marks from each group of records which extract using group by clause.
  • INNER JOIN is used to get data from both the tables, student name from tblstudent and obtain mark and examname from tblresult.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY | INNER JOIN & WHERE Clause Example

To find the total number of passing students in each subject use SQL GROUP BY Statement with WHERE condition and INNER JOIN.  

SELECT  COUNT (*) AS 'total pass', tblsubject.subjectname 
FROM   tblsubject CROSS JOIN tblresult 
WHERE        (tblresult.status = 'pass') 
GROUP BY tblsubject.subjectname;
  • In the previous query, we make use of WHERE, GROUP BY, and INNER JOIN to make a group of records of each subject we applied group by subjectname, as we want to find passing student count, we put the condition as status as pass.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY Statement | Having Clause

HAVING Clause is used as a conditional statement with GROUP BY Clause in SQL. WHERE Clause cannot be combined with aggregate results so Having clause is used which returns rows where aggregate function results matched with given conditions only.

 

INNER JOIN & HAVING Clause Example

Retrieve the detail of students who secure an average of more than 60 marks in each semester with the help of SQL GROUP BY and HAVING clause

SELECT  AVG(tblresult.obtainmark) AS 'Average Mark', tblstudent.studentname, tblresult.examname AS semester 
FROM   tblresult INNER JOIN 
tblstudent ON tblresult.student_id = tblstudent.student_id 
GROUP BY tblresult.examname, tblstudent.studentname 
HAVING  (AVG(tblresult.obtainmark) > 60);
  • In the above query, we use AVG, INNER JOIN GROUP BY, AND HAVING Clauses, when we applied GROUP BY it will make groups of students semester wise then make inner join result table with students to retrieve detail of student, AVG function is used to calculate average obtained marks, and to check average marks more than 60 we applied HAVING clause.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY Statement | ROLLUP clause with GROUP BY clause

Using SQL GROUP BY Statement extract average and a maximum of obtained marks in each semester and make sub-average and sub-max of resulting set 

SELECT  examname, AVG(obtainmark) AS avg_grades, MAX(obtainmark) AS Maximum 
FROM   tblresult 
GROUP BY ROLLUP (examname);
  • As we already study ROLLUP produces a result set containing the resulting group of SQL GROUP BY clause and the subgroup of values.
  • In the above example we first make a group of exam name-wise records of table results then applied AVG and MAX aggregate functions on that group, it gives another result set of AVG and MAX obtained marks student records, from which rollup will obtain sub-AVG and sub-MAX.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY Statement | CUBE Clause WITH GROUP BY

ROLLUP and CUBE will give you the same result if we applied with only one column name. To see the difference between both we need to use two columns in GROUP BY 

SELECT  tblresult.examname, tblsubject.subjectname, AVG(tblresult.obtainmark) AS avg_grades, MAX(tblresult.obtainmark) AS Maximum 
FROM   tblsubject INNER JOIN 
 tblresult ON tblsubject.subjectid = tblresult.subjectid 
GROUP BY CUBE (tblsubject.subjectname, tblresult.examname);
  • In exceeding example, we applied GROUP BY with CUBE to get cross-tabulation rows of first subject name and then semester wise.
  • To get the result subject name wise we also used SQL INNER JOIN in the above example.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

SQL GROUP BY Statement | GROUPING & HAVING CLAUSE

In the previous section we have learned about the use of GROUPING with GROUP BY, to filter the resulting set of CUBE AND ROLLUP, we applied the same concept in the below query 

SELECT tblresult.examname, tblsubject.subjectname, AVG(tblresult.obtainmark) AS avg_grades, MAX(tblresult.obtainmark) AS Maximum 
FROM tblsubject INNER JOIN tblresult ON tblsubject.subjectid = tblresult.subjectid 
GROUP BY CUBE (tblsubject.subjectname, tblresult.examname) 
HAVING    (GROUPING(tblresult.examname) = 1) 
ORDER BY tblsubject.subjectname;
  • In this query, we have applied GROUPING and HAVING clause with SQL GROUP BY Statement, when we cubing the result set of GROUPS BY with two columns the result we show as each column-wise Super aggregation result same in out query, we get first subject-wise and then exam name wise aggregation of AVG and MAX marks obtained data.
  • So, to filter NULL result data from this we make SUB-GROUPING at one column.

OUTPUT:

SQL GROUP BY Statement Explained with Practical Examples

 

Summary

The SQL GROUP BY clause and SQL HAVING clause are powerful clauses of SQL, specifically used to analyze large amounts of data. In this article, we covered SQL GROUP BY with WHERE, HAVING, ROLLUP, CUBE, GROUPING, ORDER BY, and also some examples of INNER JOIN also.

 

References

https://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqlj32654.html

 

Further Reading

https://docs.microsoft.com/en-us/sql/t-sql/queries/select-group-by-transact-sql?view=sql-server-ver15

 

Didn't find what you were looking for? Perform a quick search across GoLinuxCloud

If my articles on GoLinuxCloud has helped you, kindly consider buying me a coffee as a token of appreciation.

Buy GoLinuxCloud a Coffee

For any other feedbacks or questions you can either use the comments section or contact me form.

Thank You for your support!!

Leave a Comment