MYSQL Remove Duplicates

MYSQL Remove duplicate Rows

Removing duplicate rows with mySQL could be something you need to do for various reasons. One reason could be that you want to clean up your database and have unique entries. The script below is written in PHP / mySQL and can be used to remove those unwanted entries.

First, I will show the code. For those with decent PHP / mySQL skills you can just edit the mySQL statements so that they find, count and delete the duplicates from your desired table. For those who need to understand what is going on, you can see the explanation about removing the duplicate rows with mySQL.

<?php 
 include('connect.inc');
 $db = public_db_connect();
if($_POST['submit']) {
 
$command = "SELECT id, firstname, lastname, count(*) as cnt from table_sort group by firstname, lastname having count(*)>'1' order by cnt DESC ";
 
$result = mysqli_query($db, $command);
while($row = mysqli_fetch_assoc($result)) {
$firstname = $row['firstname'];
$lastname = $row['lastname'];
$cnt = $row['cnt'];
 
$firstname_array[] = $firstname;
$lastname_array[] = $lastname;
$mix[] = $firstname.",".$lastname.",".$cnt;
} 

$mix = array_unique($mix);
 
foreach ($mix as $item) {
$both = explode(",",$item);
$myfirst = $both[0];
$mylast = $both[1];
$number = $both[2];
$delete_amount = $number - 1;
 
$commandb = "DELETE FROM table_sort WHERE firstname='$myfirst' AND lastname ='$mylast' LIMIT $delete_amount ";
$resultb = mysqli_query($db, $commandb);
if($resultb){
echo "$myfirst - $mylast deleted Successfully!<br/>";
}
 
} 
}
?>
<div style="display:block; float:left; width:45%;">
 
<div style="float:left; width: 150px;">Check Names<br/><br/></div>
</div><div style="clear:both;"></div>
<div><form action="<?php echo $_SERVER['PHP_SELF'].'?'. $_SERVER['QUERY_STRING'] ; ?>" method="post" style="display:visible;">
<input name="submit" type="submit" value="Delete All Duplicates"></form></div>
</div>


Code Explanation

The file 'connect.inc' is included. It contains a function to connect to the database and return the $db variable which will be used with the mySQL queries. Then, there is a condition that takes after a form is submitted. When the form is submitted, it simply runs the code to remove the duplicate mySQL rows.

The first query gathers the rows which have a duplicate first and lastname. So, the id, firstname, lastname and count of rows is gathered in the query. In simpler terms, all rows of duplicate data are retrieved.

The mix[] array will be the duplicates. The mix array is a new comma separated string which has every duplicate row. It contains the firstname, lastname and amount of duplicates. So, if there are two rows with John Smith, the $row['cnt'] would be the same for each of them. In fact, the entire comma separated string would be the same.

After the mix array is built from the while loop, a unique mix array is made to remove any duplicates from the array. Then, a foreach loop is run and iterates over each and every value from theb mix array. The explode() function is used to create another array from the comma separated value that was stored in the mix array. This allows you to have the firstname, lastname and count of duplicates. The firstname takes on the string called $myfirst while the lastname would be $mylast and the amount of duplicates would be $number.

Now, the delete query removes all the duplicate mySQL rows except one since $delete_amount is equal to the number of duplicate rows minus 1. Thus, if there are 3 duplicate rows, only 2 will be removed.

The HTML at the bottom of the code block starting with <div style="display:block is the HTML code that shows a form so that the duplicates can be removed.

After Removing Duplicates

Once you remove the duplicates, you may want to make a custom index so that the table will not insert any entries with duplicate firstnames and lastnames. The code below shows how to do this with a simple mySQL query.

alter table table_sort add unique index(firstname, lastname);