Monday, 12 November 2018

Delete all Duplicate Rows except for One in MySQL?

How would I delete all duplicate data from a MySQL Table?
For example, with the following data:
SELECT * FROM names;

+----+--------+
| id | name   |
+----+--------+
| 1  | google |
| 2  | yahoo  |
| 3  | msn    |
| 4  | google |
| 5  | google |
| 6  | yahoo  |
+----+--------+
I would use SELECT DISTINCT name FROM names; if it were a SELECT query.
How would I do this with DELETE to only remove duplicates and keep just one record of each?

 Answers


NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also included AND n1.id <> n2.id,
 it deleted every row in the table.
1) If you want to keep the row with the lowest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
2) If you want to keep the row with the highest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
Update: Since people Googling for removing duplicates end up here
Although the OP's question is about DELETE, please be advised that using 
INSERT and DISTINCT is much faster. For a database with 8 million rows, the 
below query took 13 minutes, while using DELETE, it took more than 2 hours 
and yet didn't complete.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
    SELECT DISTINCT cellId,attributeId,entityRowId,value
    FROM tableName;

0 comments:

Post a Comment