Overview of SQL Distinct keyword
SQL Distinct keyword is used with SQL Select operation to eliminate duplicate rows from the result set and is specified before the column name with the SQL Select statement to fetch only the unique values of a particular column in the resulting recordset
SQL Distinct statement is used to return only unique (Different) values of a specified column or column list from the table of a database, The SELECT DISTINCT FROM statement only works with a single column or with a set of related columns
SQL Distinct Syntax
SELECT DISTINCT [TOP <n>] <column_name | expression_list> FROM <table | model >
[WHERE <condition list >]
[ORDER BY <expression>]
Here,
- n: specified with TOP keyword to define how many rows to return, it is Optional argument
- column_name | expression_list: Specified the list of column name or expressions with distinct keyword separated by comma
- WHERE <condition list>: It is an optional argument specified with SQL Select distinct clause to fetch records based on condition
- ORDER BY <expression>: it is an optional argument specified with SQL Select distinct clause to display resulting rows sorted order
SQL Distinct Key Points
- When one expression is provided in the SQL DISTINCT clause then the query will return the unique values of the expressions.
- The query will retrieve the unique combinations for the listed expressions if more than one expression is provided in the SQL DISTINCT clause here.
- In SQL, the DISTINCT clause cannot ignore the NULL values. So when we use the DISTINCT clause in the SQL statement, our result set will include NULL as a distinct value.
SQL Distinct Examples
Consider the School Result management database for the practical examples:
Student Table
student_id | studentname | admissionno | admissiondate | enrollmentno | date_of_birth | city | class_id | |
---|---|---|---|---|---|---|---|---|
101 | reema | 10001 | 02-02-2000 | e15200002 | 02-02-1990 | reema@gmail.com | surat | 2 |
102 | kriya | 10002 | 04-05-2001 | e16200003 | 04-08-1991 | kriya@gmail.com | surat | 1 |
103 | meena | 10003 | 06-05-1999 | e15200004 | 02-09-1989 | meena@gmail.com | vadodara | 3 |
104 | carlin | 2001 | 04-01-1998 | e14200001 | 04-04-1989 | carli@gmail.com | vapi | 1 |
105 | dhiren | 2002 | 02-02-1997 | e13400002 | 02-02-1987 | dhiru@gmail.com | vapi | 2 |
106 | hiren | 2003 | 01-01-1997 | e13400001 | 03-03-1887 | hiren@gmail.com | surat | 2 |
107 | mahir | 10004 | 06-09-2000 | e15200003 | 07-09-1990 | mahi@gmail.com | vapi | 3 |
108 | nishi | 2004 | 02-04-2001 | e16200001 | 03-02-1991 | nishi@gmail.com | vadodara | 1 |
Result Table
result_id | student_id | examname | examdate | subject | obtainmark | totalmarks | percentage | grade | status |
---|---|---|---|---|---|---|---|---|---|
3001 | 101 | sem1 | 07-08-2001 | 1 | 80 | 100 | 80 | A+ | pass |
3002 | 101 | sem1 | 08-08-2001 | 2 | 76 | 100 | 76 | A+ | pass |
3003 | 102 | sem3 | 05-05-2000 | 3 | 67 | 100 | 67 | A | pass |
3004 | 102 | sem3 | 06-05-2000 | 4 | 89 | 100 | 89 | A+ | pass |
3005 | 102 | sem3 | 07-05-2000 | 5 | 90 | 100 | 90 | A+ | pass |
3006 | 103 | sem5 | 08-09-1998 | 6 | 55 | 100 | 55 | B | pass |
3007 | 103 | sem5 | 09-09-1998 | 7 | 30 | 100 | 30 | D | fail |
3008 | 103 | sem5 | 10-09-1998 | 8 | 34 | 100 | 34 | D | fail |
Example-1: SQL Distinct on one column
To Retrieve the distinctive names of cities of students with SQL Select statement
SELECT DISTINCT city
FROM tblstudent
- In an above select query, the SQL Distinct keyword is applied before the column name city to fetch the unique value of the city name in the student table
- If we analyzed the values of city column in the student table, the name of cities like Surat, Vapi and Vadodara is repeated in the records, after applied the Distinct keyword the resulting set contains only different values
OUTPUT:
Example-2: SQL Distinct on multiple columns
We can apply SQL Distinct on multiple columns separated by a comma with SQL Select statement to fetch unique values of multiple columns in a single SQL query, if we use DISTINCT clause for multiple columns at once, SQL will consider all the supplied columns as one and return a row if it is distinct. A combination of all columns will be considered for distinct records and not individual columns.
Display unique admission date of students with their city name
SELECT DISTINCT admissiondate, city
FROM tblstudent
- This query returns all the distinct records of admisiondate and city distinct entries. Here the Distinct query will consider both the columns and return all the distinct columns against the other column entry
- The output contains unique combination rows of both column values
OUTPUT:
Example-3: SQL Distinct on multiple columns with Order By
SQL DISTINCT “order by” can be used for sorting using one or multiple columns. Use SQL Distinct to find the unique entry of combination of admissiondate, city and classid
SELECT DISTINCT admissiondate, city, class_id
FROM tblstudent
ORDER BY city
- The above query returns all the distinct records of admisiondate, city, and class_id in the sorting order of city name. Here the Distinct query will consider three the columns and return all the distinct columns against the other column entry
- The output contains a unique combination of specified column values in ascending order of city name
OUTPUT:
Example-4: SQL Distinct with Where conditional statement
SQL Where clause in the select statement with SQL Distinct keyword is used to conditionally fetch eliminate duplicate records in resulting rows
Use SQL distinct and where clause to display unique records of students who have taken admission after the year 1996
SELECT DISTINCT admissiondate, date_of_birth, city, class_id
FROM tblstudent
WHERE (YEAR(admissiondate) > 1996)
OUTPUT:
Example-5: SQL Select Distinct on SQL Joins
Select Distinct keyword is used to retrieve each combination of column values from more than one tables to be returned only one time , we can use Inner join, Outer join or full join to fetch combined unique entries of particular column values
To retrieve each students distinctive result details use SQL SELECT Distinct
SELECT DISTINCT tblresult.student_id, tblstudent.studentname, tblresult.examname, tblresult.examdate, tblresult.obtainmark, tblresult.totalmarks, tblresult.pecentage, tblresult.grade, tblresult.status, tblstudent.email, tblstudent.city
FROM tblresult INNER JOIN
tblstudent ON tblresult.student_id = tblstudent.student_id
- In above SQL query, Inner join is applied to join both tables on common column conditions using ON keyword
- SQL Distinct keyword is applied before the column name list to fetch a unique record for each combination of the column list
OUTPUT:
Difference between SQL Distinct and Group By
- DISTINCT is used to filter unique records out of all records in the table. It removes the duplicate rows, SELECT DISTINCT will always be the same, or faster than a GROUP BY
- GROUP BY operator is meant for the aggregating or grouping rows whereas DISTINCT is just used to get distinct values
Example 6: In the following example while applying a DISTINCT and GROUP BY clause on StudentName Column. It will produce the same outputs
SELECT DISTINCT studentname
FROM tblstudent
SELECT studentname
FROM tblstudent
GROUP BY studentname
- The above example, SQL Distinct is used in the first query to retrieve a unique student name, the same result can produce using SQL Group by
- In the second query of SQL Select statement, Group by clause has been applied on student name column to display only unique value of studentname
OUTPUT:
SQL Distinct with aggregate function
- DISTINCT in the SELECT clause and DISTINCT in an aggregate function does not return the same result.
- Including a non-DISTINCT aggregate function and a DISTINCT aggregate function in the same SELECT clause can produce misleading results. Either all of the aggregate functions or none should be used with DISTINCT in the SELECT clause.
- SQL Distinct cannot be used with COUNT(*), With SUM(), AVG(), and COUNT(expr), DISTINCT eliminates duplicate values before the sum, average, or count is calculated. If SQL DISTINCT is used with MIN() it won’t change the result. You can’t use DISTINCT with COUNT(*).
- In Microsoft SQL Server, if you use DISTINCT, expr must be a column name only. It can’t include an arithmetic expression
Example 7 : Calculate the total number of cities of students, without including the same city name twice
SELECT COUNT(DISTINCT city) AS 'total city'
FROM tblstudent
OUTPUT:
SELECT COUNT(city) AS 'total city'
FROM tblstudent
OUTPUT:
- In above both query SQL count aggregate function is used to calculate total cities of students belonging,
- In the first query, SQL distinct is applied with the column name city as an argument in the COUNT function, so the resulting value of this query is 3 which means that only the unique value of the city name is counted
- In the second query, SQL distinct keyword is not used in the argument of count function so city name’s duplicate value is counted, the resulting value is 8
Difference between SQL Distinct and Unique
The UNIQUE keyword in SQL plays the role of a database constraint; it ensures there are no duplicate values stored in a particular column or a set of columns. On the other hand, the DISTINCT keyword is used in the SELECT statement to fetch distinct rows from a table
Example 8 : Write SQL query to fetch unique record of each semester students
SELECT DISTINCT tblresult.examname, tblstudent.studentname
FROM tblresult INNER JOIN
tblstudent ON tblresult.student_id = tblstudent.student_id
- Above SQL select query, to retrieve unique records of each semester student details from student table SQL Distinct is applied with examname column of the result table
- As both columns examname and studentname are belongs to two different tables SQL inner join is applied to join both tables
OUTPUT:
Example of SQL unique constraint on examname column
CREATE TABLE student_result
(
result_id int PRIMARY KEY,
examname varchar(10) UNIQUE,
student_id int,
examdate date
)
Summary
In this article, we have covered an overview of SQL Distinct with some key points to remember for using SQL Distinct with SQL select statement, on one column, and on multiple columns, the difference between SQL Distinct and Group By clause and SQL Distinct with SQL Unique constrain and SQL distinct with Order By has explained with Practical examples
References
Further Reading