Problem:
Assume you have a table with structure as follows.
Comment [id, app_id, reviewer_id,review_id ....] where id is primary key.
And you have to remove duplicated records with the same (app_id, and review_id).
Here are some solutions to fulfill your requirement.
Solution 1: [For small size of table]
1) Find out which records are duplicated.
SELECT id FROM (SELECT COUNT(*) AS num, id, app_id, review_id FROM Comment GROUP BY app_id,review_id) AS result WHERE result.num>2;
NOTE: the above sql statement will list one of duplicated id;
2) Discard duplicated records
DELETE FROM Comment WHERE id in (SELECT id FROM (SELECT COUNT(*) AS num, id, app_id, review_id FROM Comment GROUP BY app_id,review_id) AS result WHERE result.num>2)
Solution 2: [Once for all]
ALTER IGNORE TABLE Comment ADD UNIQUE INDEX idx_name (app_id,review_id);
This solution works for me so that I did not examine its limitation in some extreme cases.
The credit of this solution goes to
http://stackoverflow.com/a/3312066 .
No comments:
Post a Comment