HOW TO DELETE DUPLICATE ROWS IN SQL

In this section, we learn different ways to delete duplicate rows in MySQL and Oracle. If a SQL table contains duplicate rows, we need to remove them.

Preparing sample data

The following script creates a table named contacts.

 DROP TABLE IF EXISTS contacts;
 CREATE TABLE contacts (
     id INT PRIMARY KEY AUTO_INCREMENT,
     first_name VARCHAR(30) NOT NULL,
     last_name VARCHAR(25) NOT NULL,
     email VARCHAR(210) NOT NULL,
     age VARCHAR(22) NOT NULL
 );

We insert the following data into the table.

 INSERT INTO contacts (first_name, last_name, email, age)
 VALUES
     ('Kavin', 'Peterson', '[email protected]', '21'),
     ('Nick', 'Jonas', '[email protected]', '18'),
     ('Peter', 'Heaven', '[email protected]', '23'),
     ('Michal', 'Jackson', '[email protected]', '22'),
     ('Sean', 'Bean', '[email protected]', '23'),
     ('Tom', 'Baker', '[email protected]', '20'),
     ('Ben', 'Barnes', '[email protected]', '17'),
     ('Mischa', 'Barton', '[email protected]', '18'),
     ('Sean', 'Bean', '[email protected]', '16'),
     ('Eliza', 'Bennett', '[email protected]', '25'),
     ('Michal', 'Krane', '[email protected]', '25'),
     ('Peter', 'Heaven', '[email protected]', '20'),
     ('Brian', 'Blessed', '[email protected]', '20'),
     ('Kavin', 'Peterson', '[email protected]', '30');

We can rerun this script to recreate the test data after executing a DELETE statement.

The following query returns the data from the contacts table:

 SELECT * FROM contacts ORDER BY email;

id	first_name	last_name	email	age
7	Ben	Barnes	[email protected]	17
13	Brian	Blessed	[email protected]	20
10	Eliza	Bennett	[email protected]	25
1	Kavin	Peterson	[email protected]	21
14	Kavin	Peterson	[email protected]	30
8	Mischa	Barton	[email protected]	18
11	Michal	Krane	[email protected]	25
4	Michal	Jackson	[email protected]	22
2	Nick	Jonas	[email protected]	18
3	Peter	Heaven	[email protected]	23
12	Peter	Heaven	[email protected]	20
5	Sean	Bean	[email protected]	23
9	Sean	Bean	[email protected]	16
6	Tom	Baker	[email protected]	20

The following SQL query returns the duplicate emails from the contacts table:

 SELECT email, COUNT(email)
 FROM contacts
 GROUP BY email
 HAVING COUNT(email) > 1;

email	COUNT(email)
[email protected]	2
[email protected]	2
[email protected]	2

Three emails are duplicated, each appearing in two rows.
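The grouped query only reports which emails repeat, not which rows carry them. As a minimal sketch (assuming the contacts table defined above; the alias dup is only an illustrative name), the grouped result can be joined back to the table to list every row involved in a duplicate:

```sql
-- List every row whose email occurs more than once in contacts.
-- "dup" is a hypothetical alias for the derived table of repeated emails.
SELECT c.id, c.first_name, c.last_name, c.email
FROM contacts AS c
INNER JOIN (
    SELECT email
    FROM contacts
    GROUP BY email
    HAVING COUNT(*) > 1
) AS dup ON dup.email = c.email
ORDER BY c.email, c.id;
```

Sorting by email and then id places the copies of each email next to each other, which makes it easier to decide which copy to keep.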


(A) Delete duplicate rows using the DELETE JOIN statement

The following statement joins the contacts table to itself on email and deletes every row that has a duplicate with a lower id, so the row with the lowest id in each group is kept:

 DELETE t1 FROM contacts t1
 INNER JOIN contacts t2
 WHERE t1.id > t2.id AND t1.email = t2.email;

Output:

 Query OK, 3 rows affected (0.10 sec)

Three rows have been deleted. We execute the query below to find the duplicate emails in the table.

 SELECT email, COUNT(email)
 FROM contacts
 GROUP BY email
 HAVING COUNT(email) > 1;

The query returns an empty set. To verify the data in the contacts table, execute the following SQL query:

 SELECT * FROM contacts;

id	first_name	last_name	email	age
7	Ben	Barnes	[email protected]	17
13	Brian	Blessed	[email protected]	20
10	Eliza	Bennett	[email protected]	25
1	Kavin	Peterson	[email protected]	21
8	Mischa	Barton	[email protected]	18
11	Michal	Krane	[email protected]	25
4	Michal	Jackson	[email protected]	22
2	Nick	Jonas	[email protected]	18
3	Peter	Heaven	[email protected]	23
5	Sean	Bean	[email protected]	23
6	Tom	Baker	[email protected]	20

The rows with ids 9, 12, and 14 have been deleted.

The following statement deletes duplicate rows and keeps the row with the lowest id for each email. Run the sample-data script to recreate the contacts table, then execute:

 DELETE c1 FROM contacts c1
 INNER JOIN contacts c2
 WHERE c1.id > c2.id AND c1.email = c2.email;

id	first_name	last_name	email	age
1	Ben	Barnes	[email protected]	21
2	Kavin	Peterson	[email protected]	22
3	Brian	Blessed	[email protected]	18
4	Nick	Jonas	[email protected]	16
5	Michal	Krane	[email protected]	17
6	Eliza	Bennett	[email protected]	23
7	Michal	Jackson	[email protected]	18
8	Sean	Bean	[email protected]	20
9	Mischa	Barton	[email protected]	20
10	Peter	Heaven	[email protected]	25
11	Tom	Baker	[email protected]	30
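Because a DELETE cannot be undone once it is committed, it can help to preview the affected rows first. The sketch below is a plain SELECT that reuses the same join condition as the DELETE JOIN (same email, higher id). It assumes the contacts table from this section and only reads data.

```sql
-- Preview the rows a DELETE JOIN with the condition
-- "t1.id > t2.id AND t1.email = t2.email" would remove.
-- DISTINCT avoids listing a row several times when an email
-- has more than two copies.
SELECT DISTINCT t1.id, t1.first_name, t1.last_name, t1.email
FROM contacts AS t1
INNER JOIN contacts AS t2
    ON t1.email = t2.email
   AND t1.id > t2.id
ORDER BY t1.id;
```

If the listed ids are the copies you expect to lose, the same condition can then be used in the DELETE statement.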

(B) Delete duplicate rows using an intermediate table

To delete duplicate rows using an intermediate table, follow the steps below:

Step 1. Create a new table with the same structure as the original table:

 CREATE TABLE source_copy LIKE source;

Step 2. Insert the distinct rows from the original table into the new table:

 INSERT INTO source_copy SELECT * FROM source GROUP BY col;

Step 3. Drop the original table and rename the intermediate table to the original name.

 DROP TABLE source;
 ALTER TABLE source_copy RENAME TO source;

For example, the following statements remove the rows with duplicate emails from the contacts table:

 -- step 1
 CREATE TABLE contacts_temp LIKE contacts;

 -- step 2
 INSERT INTO contacts_temp
 SELECT * FROM contacts GROUP BY email;

 -- step 3
 DROP TABLE contacts;
 ALTER TABLE contacts_temp RENAME TO contacts;
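One caveat with step 2: SELECT * combined with GROUP BY email selects columns that are neither grouped nor aggregated, and MySQL rejects this when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.5. The following is a sketch of an alternative that does not depend on that mode, assuming the contacts table from this section and that keeping the lowest id per email is acceptable:

```sql
-- step 1: empty copy with the same structure
CREATE TABLE contacts_temp LIKE contacts;

-- step 2: copy only the row with the lowest id for each email
INSERT INTO contacts_temp
SELECT *
FROM contacts
WHERE id IN (
    SELECT MIN(id)
    FROM contacts
    GROUP BY email
);

-- step 3: swap the tables
DROP TABLE contacts;
ALTER TABLE contacts_temp RENAME TO contacts;
```

Choosing MIN(id) also makes it explicit which copy survives, whereas a bare GROUP BY leaves that choice to the server.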

(C) Delete duplicate rows using the ROW_NUMBER() function

Note: The ROW_NUMBER() function has been supported since MySQL version 8.0.2, so we should check our MySQL version before using it.

The following statement uses ROW_NUMBER() to assign a sequential integer to every row within each group of rows that share an email. If an email is duplicated, the row number of its extra copies is greater than one. Ordering by id within each partition makes the numbering deterministic, so the row with the lowest id gets number 1.

 SELECT id, email,
     ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS row_num
 FROM contacts;

The following SQL query returns the list of ids of the duplicate rows:

 SELECT id
 FROM (
     SELECT id,
         ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS row_num
     FROM contacts
 ) t
 WHERE row_num > 1;

Output:

id
9
12
14
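The query above only lists the ids of the duplicate rows. A minimal sketch of the corresponding deletion, assuming MySQL 8.0.2 or later and the contacts table from this section, wraps that query in a DELETE:

```sql
-- Delete every row whose row number within its email group is greater
-- than 1, which keeps the row with the lowest id for each email.
-- The inner derived table (alias t) is what allows MySQL to read from
-- contacts in a subquery while deleting from it.
DELETE FROM contacts
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS row_num
        FROM contacts
    ) AS t
    WHERE row_num > 1
);
```

Changing ORDER BY id to ORDER BY id DESC would keep the row with the highest id instead.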

Delete duplicate records in Oracle

When we find duplicate records in a table, we have to delete the unwanted copies to keep our data clean and unique. If a table has duplicate rows, we can delete them using the DELETE statement.

In this case, the table has a column that is not part of the group of columns used to evaluate duplicates.

Consider the table shown below:


VEGETABLE_ID	VEGETABLE_NAME	COLOR
01	Potato	Brown
02	Potato	Brown
03	Onion	Red
04	Onion	Red
05	Onion	Red
06	Pumpkin	Green
07	Pumpkin	Yellow

 -- create the vegetables table
 CREATE TABLE vegetables (
     VEGETABLE_ID NUMBER GENERATED BY DEFAULT AS IDENTITY,
     VEGETABLE_NAME VARCHAR2(100),
     color VARCHAR2(20),
     PRIMARY KEY (VEGETABLE_ID)
 );

 -- insert sample rows
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Potato', 'Brown');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Potato', 'Brown');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Pumpkin', 'Green');
 INSERT INTO vegetables (VEGETABLE_NAME, color) VALUES ('Pumpkin', 'Yellow');

 -- query data from the vegetables table
 SELECT * FROM vegetables;
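Before deleting anything, it can be useful to see which combinations of name and color actually repeat. The sketch below assumes the vegetables table and sample rows above; the column alias copies is only an illustrative name.

```sql
-- Count how many rows share each (VEGETABLE_NAME, color) pair and
-- keep only the pairs that occur more than once.
SELECT VEGETABLE_NAME, color, COUNT(*) AS copies
FROM vegetables
GROUP BY VEGETABLE_NAME, color
HAVING COUNT(*) > 1
ORDER BY VEGETABLE_NAME, color;
```

With the sample rows above, this reports Onion/Red with three copies and Potato/Brown with two. The two Pumpkin rows differ in color, so they are not duplicates.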

Suppose we want to keep the row with the highest VEGETABLE_ID and delete all other copies. The following query returns the ids of the rows to keep:

 SELECT MAX(VEGETABLE_ID)
 FROM vegetables
 GROUP BY VEGETABLE_NAME, color
 ORDER BY MAX(VEGETABLE_ID);

MAX(VEGETABLE_ID)
2
5
6
7

We use the DELETE statement to delete the rows whose values in the VEGETABLE_ID column are not the highest in their group.

 DELETE FROM vegetables
 WHERE VEGETABLE_ID NOT IN (
     SELECT MAX(VEGETABLE_ID)
     FROM vegetables
     GROUP BY VEGETABLE_NAME, color
 );

Three rows have been deleted.

 SELECT * FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	COLOR
02	Potato	Brown
05	Onion	Red
06	Pumpkin	Green
07	Pumpkin	Yellow

If we want to keep the row with the lowest id instead, we use the MIN() function in place of the MAX() function.

 DELETE FROM vegetables
 WHERE VEGETABLE_ID NOT IN (
     SELECT MIN(VEGETABLE_ID)
     FROM vegetables
     GROUP BY VEGETABLE_NAME, color
 );

The method above works when the table has a column that is not part of the group used to evaluate duplicates. If the values in all columns are duplicated, we cannot use the VEGETABLE_ID column.

Let's drop the vegetables table and recreate it with a new structure.

 DROP TABLE vegetables;
 CREATE TABLE vegetables (
     VEGETABLE_ID NUMBER,
     VEGETABLE_NAME VARCHAR2(100),
     color VARCHAR2(20)
 );

 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (1, 'Potato', 'Brown');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (1, 'Potato', 'Brown');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (2, 'Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (2, 'Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (2, 'Onion', 'Red');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (3, 'Pumpkin', 'Green');
 INSERT INTO vegetables (VEGETABLE_ID, VEGETABLE_NAME, color) VALUES (4, 'Pumpkin', 'Yellow');

 SELECT * FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	COLOR
01	Potato	Brown
01	Potato	Brown
02	Onion	Red
02	Onion	Red
02	Onion	Red
03	Pumpkin	Green
04	Pumpkin	Yellow

In the vegetables table, the duplicate rows now repeat the values in all columns: VEGETABLE_ID, VEGETABLE_NAME, and color.

We can use the rowid, a locator that specifies where Oracle stores the row. Because the rowid is unique for each row, we can use it to remove the duplicate rows.

 DELETE FROM vegetables
 WHERE rowid NOT IN (
     SELECT MIN(rowid)
     FROM vegetables
     GROUP BY VEGETABLE_ID, VEGETABLE_NAME, color
 );

The following query verifies the deletion:

 SELECT * FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	COLOR
01	Potato	Brown
02	Onion	Red
03	Pumpkin	Green
04	Pumpkin	Yellow
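The rowid approach can also be combined with ROW_NUMBER(), mirroring method (C) from the MySQL part of this section. The following is a sketch only, assuming the vegetables table with fully duplicated rows as created above; rid and row_num are illustrative aliases.

```sql
-- Number the rows inside each group of identical values, then delete
-- every row except the first one of its group. Rows are identified by
-- rowid because no column distinguishes the copies.
DELETE FROM vegetables
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY VEGETABLE_ID, VEGETABLE_NAME, color
                   ORDER BY rowid
               ) AS row_num
        FROM vegetables
    )
    WHERE row_num > 1
);
```

Run against the seven sample rows, this removes the same three extra copies as the MIN(rowid) statement. It is an alternative formulation, not a required extra step.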
