


Detailed explanation of MongoDB performance issues when migrating from MySQL to MongoDB
最近忙着把一个项目从MySQL迁移到MongoDB,在导入旧数据的过程中,遇到了些许波折,犯了不少错误,但同时也学到了不少知识,遂记录下来,需要的朋友可以参考下
公司为这个项目专门配备了几台高性能务器,清一色的双路四核超线程CPU,外加32G内存,运维人员安装好MongoDB后,就交我手里了,我习惯于在使用新服务器前先看看相关日志,了解一下基本情况,当我浏览MongoDB日志时,发现一些警告信息:
WARNING: You are running on a NUMA machine. We suggest launching mongod like this to avoid performance problems: numactl –interleave=all mongod [other options]
当时我并不太清楚NUMA是什么东西,所以没有处理,只是把问题反馈给了运维人员,后来知道运维人员也没有理会这茬儿,所以问题的序幕就这样拉开了。
迁移工作需要导入旧数据。MongoDB本身有一个mongoimport工具可供使用,不过它只接受json、csv等格式的源文件,不适合我的需求,所以我没用,而是用PHP写了一个脚本,平稳运行了一段时间后,我发现数据导入的速度下降了,同时PHP抛出异常:
cursor timed out (timeout: 30000, time left: 0:0, status: 0)
我一时判断不出问题所在,想想先在PHP脚本里加大Timeout的值应付一下:
<?php MongoCursor::$timeout = -1; ?>
可惜这样并没有解决问题,错误反倒变着花样的出现了:
max number of retries exhausted, couldn't send query, couldn't send query: Broken pipe
接着使用strace跟踪了一下PHP脚本,发现进程卡在了recvfrom操作上:
shell> strace -f -r -p <PID> recvfrom(<FD>,
通过如下命令查询recvfrom操作的含义:
shell> apropos recvfrom receive a message from a socket
或者按照下面的方式确认一下:
shell> lsof -p <PID> shell> ls -l /proc/<PID>/fd/<FD>
此时如果查询MongoDB的当前操作,会发现几乎每个操作会消耗大量的时间:
mongo> db.currentOp()
与此同时,运行mongostat的话,结果会显示很高的locked值。
…
我在网络上找到一篇:MongoDB Pre-Splitting for Faster Data Loading and Importing,看上去和我的问题很类似,不过他的问题实质是由于自动分片导致数据迁移所致,解决方法是使用手动分片,而我并没有使用自动分片,自然不是这个原因。
…
询问了几个朋友,有人反映曾遇到过类似的问题,在他的场景里,问题的主要原因是系统IO操作繁忙时,数据文件预分配堵塞了其它操作,从而导致雪崩效应。
为了验证这种可能,我搜索了一下MongoDB日志:
shell> grep FileAllocator /path/to/log [FileAllocator] allocating new datafile ... filling with zeroes... [FileAllocator] done allocating datafile ... took ... secs
我使用的文件系统是ext4(xfs也不错 ),创建数据文件非常快,所以不是这个原因,但如果有人使用ext3,可能会遇到这类问题,所以还是大概介绍一下如何解决:
MongoDB按需自动生成数据文件:先是
#!/bin/sh DB_NAME=$1 cd /path/to/$DB_NAME for INDEX_NUMBER in {5..50}; do FILE_NAME=$DB_NAME.$INDEX_NUMBER if [ ! -e $FILE_NAME ]; then head -c 2146435072 /dev/zero > $FILE_NAME fi done
注:数值2146435072并不是标准的2G,这是INT整数范围决定的。
…
最后一个求助方式就是官方论坛了,那里的国际友人建议我检查一下是不是索引不佳所致,死马当活马医,我激活了Profiler记录慢操作:
mongo> use <DB> mongo> db.setProfilingLevel(1);
不过结果显示基本都是insert操作(因为我是导入数据为主),本身就不需要索引:
mongo> use <DB> mongo> db.system.profile.find().sort({$natural:-1})
…
问题始终没有得到解决,求人不如求己,我又重复了几次迁移旧数据的过程,结果自然还是老样子,但我发现每当出问题的时候,总有一个名叫irqbalance的进程CPU占用率居高不下,搜索了一下,发现很多介绍irqbalance的文章中都提及了NUMA,让我一下子想起之前在日志中看到的警告信息,我勒个去,竟然绕了这么大一个圈圈!安下心来仔细翻阅文档,发现官方其实已经有了相关介绍,按如下设置搞定:
shell> echo 0 > /proc/sys/vm/zone_reclaim_mode shell> numactl --interleave=all mongod [options]
关于zone_reclaim_mode内核参数的说明,可以参考官方文档。
注:从MongoDB1.9.2开始:MongoDB会在启动时自动设置zone_reclaim_mode。
至于NUMA的含义,简单点说,在有多个物理CPU的架构下,NUMA把内存分为本地和远程,每个物理CPU都有属于自己的本地内存,访问本地内存速度快于访问远程内存,缺省情况下,每个物理CPU只能访问属于自己的本地内存。对于MongoDB这种需要大内存的服务来说就可能造成内存不足,NUMA的详细介绍,可以参考老外的文章。
Theoretically, MySQL, Redis, Memcached, etc. may be affected by NUMA and need to be paid attention to.
The above is the detailed content of Detailed explanation of MongoDB performance issues when migrating from MySQL to MongoDB. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

Apache connects to a database requires the following steps: Install the database driver. Configure the web.xml file to create a connection pool. Create a JDBC data source and specify the connection settings. Use the JDBC API to access the database from Java code, including getting connections, creating statements, binding parameters, executing queries or updates, and processing results.

The process of starting MySQL in Docker consists of the following steps: Pull the MySQL image to create and start the container, set the root user password, and map the port verification connection Create the database and the user grants all permissions to the database

Installing MySQL on CentOS involves the following steps: Adding the appropriate MySQL yum source. Execute the yum install mysql-server command to install the MySQL server. Use the mysql_secure_installation command to make security settings, such as setting the root user password. Customize the MySQL configuration file as needed. Tune MySQL parameters and optimize databases for performance.

Detailed explanation of MongoDB efficient backup strategy under CentOS system This article will introduce in detail the various strategies for implementing MongoDB backup on CentOS system to ensure data security and business continuity. We will cover manual backups, timed backups, automated script backups, and backup methods in Docker container environments, and provide best practices for backup file management. Manual backup: Use the mongodump command to perform manual full backup, for example: mongodump-hlocalhost:27017-u username-p password-d database name-o/backup directory This command will export the data and metadata of the specified database to the specified backup directory.

Encrypting MongoDB database on a Debian system requires following the following steps: Step 1: Install MongoDB First, make sure your Debian system has MongoDB installed. If not, please refer to the official MongoDB document for installation: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-debian/Step 2: Generate the encryption key file Create a file containing the encryption key and set the correct permissions: ddif=/dev/urandomof=/etc/mongodb-keyfilebs=512

The key to installing MySQL elegantly is to add the official MySQL repository. The specific steps are as follows: Download the MySQL official GPG key to prevent phishing attacks. Add MySQL repository file: rpm -Uvh https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm Update yum repository cache: yum update installation MySQL: yum install mysql-server startup MySQL service: systemctl start mysqld set up booting

Sorting index is a type of MongoDB index that allows sorting documents in a collection by specific fields. Creating a sort index allows you to quickly sort query results without additional sorting operations. Advantages include quick sorting, override queries, and on-demand sorting. The syntax is db.collection.createIndex({ field: <sort order> }), where <sort order> is 1 (ascending order) or -1 (descending order). You can also create multi-field sorting indexes that sort multiple fields.
