"Filter" huge MariaDB/Mysql tables based on different tables
P粉287345251
P粉287345251 2023-09-07 10:51:26
0
2
607

Working with large datasets in my mariaDB database. I have two tables, Table A contains 57 million rows and Table B contains about 500 rows. Table B is a subset of ids related to columns in Table A. I want to delete all rows in A table B.

Example table A:

classification_id Name
20 Mercedes
30 Kawasaki
80 Leitz
70 HP

Example Table B:

classification_id type
20 car
30 bike
40 the bus
50 Boat

So, in this example, the last two rows in table A will be deleted (or a mirrored table will be created containing only the first two rows, which is also OK).

I tried executing a second query using an inner join, but this query took a few minutes and gave an out of memory exception.

Any suggestions on how to resolve this issue?

P粉287345251
P粉287345251

reply all(2)
P粉258083432

Since you say the filtered table contains a relatively small number of rows, your best option is to create a separate table with the same columns as the original table A and rows that match your criteria , then replace the original table and delete it. Also, with this number of IDs, you may want to use WHERE IN () instead of a join - as long as the fields you use there are indexed, it will usually be way em> faster. Putting it all together:

CREATE TABLE new_A AS
  SELECT A.* FROM A 
  WHERE classification_id IN (SELECT classification_id FROM B);
RENAME TABLE A TO old_A, new_A to A;
DROP TABLE old_A;

Things to note:

  • Back up your data! And test the query thoroughly before running DROP TABLE. You don’t want to lose 57M rows of data because of random answers on StackOverflow.
  • If A has any indexes or foreign keys, these will not be copied - so you must recreate them manually. I recommend running SHOW CREATE TABLE A first and taking note of its structure. Alternatively, you might consider using the output of SHOW CREATE TABLE A as a template to explicitly create table new_A, and then do INSERT INTO new_A SELECT ... code> instead of CREATE TABLE new_A AS SELECT ... and use the same query after that.
P粉155832941

Try this:

delete from "table A" where classification_id not in (select classification_id from "table B");
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template