Find Duplicate Data in a MySQL Database
Here is a handy little query for finding duplicate data in a MySQL data base. This can be useful for finding duplicate data, creating a leaderboard, a top ten list, etc. My original use case was to assist in de-duping some large database tables.
SELECT count(some_column), some_column FROM some_table GROUP BY some_column HAVING count(*) > 1;
In the following example I modified the code to return a list of the 10 most common first names among all of the entries in the people table.
SELECT count(first_name) as Count, first_name as Name FROM people GROUP BY first_name HAVING count(*) > 1 ORDER BY COUNT DESC LIMIT 10;
The results will look like the following.
+-------+---------+
| Count | Name |
+-------+---------+
| 29 | Michael |
| 24 | John |
| 18 | James |
| 18 | David |
| 16 | Robert |
| 11 | Jason |
| 11 | William |
| 9 | Charles |
| 9 | Anthony |
| 8 | Eric |
+-------+---------+
10 rows in set (0.01 sec)