I was recently trying to add a unique index on a table, and it gave me an error that the index could not be created because the table contained duplicate records across the fields I was trying to use in my unique index. As a result I found a need to remove duplicate records from my table, and decided to share the code I used to do so below.
The following SQL can be used to remove duplicates from any DB2 table. You’ll want to identify the columns in the table that you need to be unique as they’ll be used as the partition columns in the statement.
The following variables are used in the SQL shown below and will need to be replaced prior to executing the SQL:
- ${TABLE} : The table to remove duplicate records from.
- ${COL1} : The first column by which you wish to partition your table.
- ${COL2} : The second column by which you wish to partition your table.
NOTE: The example below has been designed to handle a partition over two columns; however, you could easily increase or decrease the number of columns accordingly.
DELETE FROM (SELECT ROWNUMBER() OVER (PARTITION BY ${COL1}, ${COL2}) AS ROWNUM FROM ${TABLE}) AS TMP WHERE ROWNUM > 1;
NOTE: This query is designed to be run in DB2 v9. I have not tested it to see if it will run on prior versions of DB2.
Veru useful info….Excellent
Hi
I tried using your query. But I get following error. I am using db2 z/os version 11. Can you help me out?
delete from (select rownumber() over (partition by EMPNAME) as ROWNUM FROM EMPTABLE) AS TEMP WHERE ROWNUM > 1;
IBM DATA STUDIO OUTPUT BELOW:-
ILLEGAL SYMBOL “(“. SOME SYMBOLS THAT MIGHT BE LEGAL ARE: . SQLCODE=-104, SQLSTATE=42601, DRIVER=4.18.60
QMF OUTPUT BELOW:-
SQL error at or before ( (line 1, position 13).