[] 千万级的表如何去重复
[求助] 千万级的表怎么去重复?
一直都是在折腾万级别的小小数据库,不知道索引、数据类型等的不同会对效率有多大影响。最近不是密码 泄露吗?就下了个,导入mysql数据库,共两千多万条记录,只留密码字段,其他字段全部删除,进行select、insert等测试,有了索引select的效率明显不同,但在去重复时遇到难题。
方法一:
CREATE TABLE newtable SELECT DISTINCT pwd FROM oldtable
这种方式看起来效率最高,但运行时直接把机器拖死,内存一会儿就用完了。
方法二:
逐条获取再删除重复(每次提取$num条记录,我的$num=50)
$result = mysql_query("SELECT MIN(id), pwd FROM tablename WHERE id BETWEEN $id AND $num GROUP BY pwd");
while($row = mysql_fetch_row($result)){
mysql_query("DELETE FROM tablename WHERE id>$row[0] AND pwd='$row[1]'");
}
$id += $num;
再通过地址栏或cookie等传递$id,效率太低,处理了100分钟,才删除了30多万条重复
请问我应该怎么做,效率才会更高?谢谢
------解决方案--------------------
创建临时表方法好
之前一般建议别人这样操作,但不一定能听进去,小数据量倒无所谓
http://topic.csdn.net/u/20111225/22/7cabedc3-5e9e-42b3-b05b-153ba5a5a67f.html
操作时候占资源是必须的,,不可避免。。。。。除非你乐意慢慢等待
------解决方案--------------------
2100w,不知道加unique效率如何,你可试下
- SQL code
alter ignore table mypwd add unique(pwd); alter table mypwd drop index pwd; <br><font color="#e78608">------解决方案--------------------</font><br>用临时表吧。create temporary table .... <br><font color="#e78608">------解决方案--------------------</font><br>试试:<br><br>新建表,设定唯一字段。<br>导出sql文件。 <br>重新source导入. <br><font color="#e78608">------解决方案--------------------</font><br>你可以建唯一键。不要索引。 重复直接报错忽略。<br><br>select内存不够进,仍要存盘。 而且有distinct. 还要对比重复。 应没有source快。 <br><font color="#e78608">------解决方案--------------------</font><br>

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

MySQL is an open source relational database management system, mainly used to store and retrieve data quickly and reliably. Its working principle includes client requests, query resolution, execution of queries and return results. Examples of usage include creating tables, inserting and querying data, and advanced features such as JOIN operations. Common errors involve SQL syntax, data types, and permissions, and optimization suggestions include the use of indexes, optimized queries, and partitioning of tables.

You can open phpMyAdmin through the following steps: 1. Log in to the website control panel; 2. Find and click the phpMyAdmin icon; 3. Enter MySQL credentials; 4. Click "Login".

MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

MySQL is chosen for its performance, reliability, ease of use, and community support. 1.MySQL provides efficient data storage and retrieval functions, supporting multiple data types and advanced query operations. 2. Adopt client-server architecture and multiple storage engines to support transaction and query optimization. 3. Easy to use, supports a variety of operating systems and programming languages. 4. Have strong community support and provide rich resources and solutions.

Apache connects to a database requires the following steps: Install the database driver. Configure the web.xml file to create a connection pool. Create a JDBC data source and specify the connection settings. Use the JDBC API to access the database from Java code, including getting connections, creating statements, binding parameters, executing queries or updates, and processing results.

The process of starting MySQL in Docker consists of the following steps: Pull the MySQL image to create and start the container, set the root user password, and map the port verification connection Create the database and the user grants all permissions to the database

The main role of MySQL in web applications is to store and manage data. 1.MySQL efficiently processes user information, product catalogs, transaction records and other data. 2. Through SQL query, developers can extract information from the database to generate dynamic content. 3.MySQL works based on the client-server model to ensure acceptable query speed.

Laravel is a PHP framework for easy building of web applications. It provides a range of powerful features including: Installation: Install the Laravel CLI globally with Composer and create applications in the project directory. Routing: Define the relationship between the URL and the handler in routes/web.php. View: Create a view in resources/views to render the application's interface. Database Integration: Provides out-of-the-box integration with databases such as MySQL and uses migration to create and modify tables. Model and Controller: The model represents the database entity and the controller processes HTTP requests.
