mongodb numa问题
mongodb在迁移数据时,通常需要导入旧数据。开始一切倒还正常,不过几小时之后,我发现数据导入的速度下降了,同时我的脚本开始不停的抛出异常。我又重复了几次
mongodb在迁移数据时,通常需要导入旧数据。开始一切倒还正常,不过几小时之后,我发现数据导入的速度下降了,同时我的脚本开始不停的抛出异常。我又重复了几次迁移旧数据的过程,结果自然还是老样子,但我发现每当出问题的时候,总有一个名叫irqbalance的进程CPU占用率居高不下,搜索了一下,发现很多介绍irqbalance的文章中都提及了NUMA,让我一下子想起之前在mongodb日志中看到的警告信息,这个问题或许还真是跟NUMA有关。
mongodb日志中的警告信息:
Fri May 10 16:17:48 [initandlisten] MongoDB starting : pid=18486 port=27017 dbpath=/backup/mongodbData 64-bit host=sqcache02
Fri May 10 16:17:48 [initandlisten]
Fri May 10 16:17:48 [initandlisten] ** WARNING: You are running on a NUMA machine.
Fri May 10 16:17:48 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Fri May 10 16:17:48 [initandlisten] ** numactl --interleave=all mongod [other options
那么NUMA到底是什么?我们来了解一下。
转自网络:
一、NUMA和SMP是两种CPU相关的硬件架构。在SMP架构里面,所有的CPU争用一个总线来访问所有内存,优点是资源共享,而缺点是总线争用激烈。随着PC服务器上的CPU数量变多(不仅仅是CPU核数),总线争用的弊端慢慢越来越明显,于是Intel在Nehalem CPU上推出了NUMA架构,而AMD也推出了基于相同架构的Opteron CPU。
NUMA最大的特点是引入了node和distance的概念。对于CPU和内存这两种最宝贵的硬件资源,NUMA用近乎严格的方式划分了所属的资源组(node),而每个资源组内的CPU和内存是几乎相等。资源组的数量取决于物理CPU的个数(现有的PC server大多数有两个物理CPU,每个CPU有4个核);distance这个概念是用来定义各个node之间调用资源的开销,为资源调度优化算法提供数据支持。
NUMA和SMP是两种CPU相关的硬件架构。在SMP架构里面,所有的CPU争用一个总线来访问所有内存,优点是资源共享,而缺点是总线争用激烈。随着PC服务器上的CPU数量变多(不仅仅是CPU核数),,总线争用的弊端慢慢越来越明显,于是Intel在Nehalem CPU上推出了NUMA架构,而AMD也推出了基于相同架构的Opteron CPU。
NUMA最大的特点是引入了node和distance的概念。对于CPU和内存这两种最宝贵的硬件资源,NUMA用近乎严格的方式划分了所属的资源组(node),而每个资源组内的CPU和内存是几乎相等。资源组的数量取决于物理CPU的个数(现有的PC server大多数有两个物理CPU,每个CPU有4个核);distance这个概念是用来定义各个node之间调用资源的开销,为资源调度优化算法提供数据支持。
二、NUMA相关的策略
1、每个进程(或线程)都会从父进程继承NUMA策略,并分配有一个优先node。如果NUMA策略允许的话,进程可以调用其他node上的资源。
2、NUMA的CPU分配策略有cpunodebind、physcpubind。cpunodebind规定进程运行在某几个node之上,而physcpubind可以更加精细地规定运行在哪些核上。
3、NUMA的内存分配策略有localalloc、preferred、membind、interleave。localalloc规定进程从当前node上请求分配内存;而preferred比较宽松地指定了一个推荐的node来获取内存,如果被推荐的node上没有足够内存,进程可以尝试别的node。membind可以指定若干个node,进程只能从这些指定的node上请求分配内存。interleave规定进程从指定的若干个node上以RR算法交织地请求分配内存。
三、NUMA和swap的关系
NUMA的内存分配策略对于进程(或线程)之间来说,并不是公平的。在现有的Redhat Linux中,localalloc是默认的NUMA内存分配策略,这个配置选项导致资源独占程序很容易将某个node的内存用尽。而当某个node的内存耗尽时,Linux又刚好将这个node分配给了某个需要消耗大量内存的进程(或线程),swap就妥妥地产生了。尽管此时还有很多page cache可以释放,甚至还有很多的free内存。
四、NUMA的含义,简单点说,在有多个物理CPU的架构下,NUMA把内存分为本地和远程,每个物理CPU都有属于自己的本地内存,访问本地内存速度快于访问远程内存,缺省情况下,每个物理CPU只能访问属于自己的本地内存。对于MongoDB这种需要大内存的服务来说就可能造成内存不足,进而影响性能。
最后我们看一下,如何解决这个问题:
# echo 0 > /proc/sys/vm/zone_reclaim_mode
在启动时,加参数
# numactl --interleave=all /data/mongodb/bin/mongod -f /app/mongodb.conf --dbpath=/data/db --fork --logpath=/data/logs/mongodb.log
这样登陆mongodb就不会有警告信息了。
(#numactl --interleave=all mongod -f /app/mongodb.conf
forked process: 18523
all output going to: /app/mongodb/log/mongodb.log
child process started successfully, parent exiting)
***** SERVER RESTARTED *****
Fri May 10 16:33:15 [initandlisten] MongoDB starting : pid=18523 port=27017 dbpath=/backup/mongodbData 64-bit host=sqcache02
Fri May 10 16:33:15 [initandlisten] db version v2.2.6, pdfile version 4.5
Fri May 10 16:33:15 [initandlisten] git version: d626379119a6de9f2fb390780cf2fc336dfd540d
Fri May 10 16:33:15 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri
关于zone_reclaim_mode内核参数的说明,可以参考官方文档。
注:从MongoDB1.9.2开始:MongoDB会在启动时自动设置zone_reclaim_mode,无需手工更改。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

It is recommended to use the latest version of MongoDB (currently 5.0) as it provides the latest features and improvements. When selecting a version, you need to consider functional requirements, compatibility, stability, and community support. For example, the latest version has features such as transactions and aggregation pipeline optimization. Make sure the version is compatible with the application. For production environments, choose the long-term support version. The latest version has more active community support.

Node.js is a server-side JavaScript runtime, while Vue.js is a client-side JavaScript framework for creating interactive user interfaces. Node.js is used for server-side development, such as back-end service API development and data processing, while Vue.js is used for client-side development, such as single-page applications and responsive user interfaces.

With the development of the Internet, people's lives are becoming more and more digital, and the demand for personalization is becoming stronger and stronger. In this era of information explosion, users are often faced with massive amounts of information and have no choice, so the importance of real-time recommendation systems has become increasingly prominent. This article will share the experience of using MongoDB to implement a real-time recommendation system, hoping to provide some inspiration and help to developers. 1. Introduction to MongoDB MongoDB is an open source NoSQL database known for its high performance, easy scalability and flexible data model. Compared to biography

The data of the MongoDB database is stored in the specified data directory, which can be located in the local file system, network file system or cloud storage. The specific location is as follows: Local file system: The default path is Linux/macOS:/data/db, Windows: C:\data\db. Network file system: The path depends on the file system. Cloud Storage: The path is determined by the cloud storage provider.

The MongoDB database is known for its flexibility, scalability, and high performance. Its advantages include: a document data model that allows data to be stored in a flexible and unstructured way. Horizontal scalability to multiple servers via sharding. Query flexibility, supporting complex queries and aggregation operations. Data replication and fault tolerance ensure data redundancy and high availability. JSON support for easy integration with front-end applications. High performance for fast response even when processing large amounts of data. Open source, customizable and free to use.

MongoDB is a document-oriented, distributed database system used to store and manage large amounts of structured and unstructured data. Its core concepts include document storage and distribution, and its main features include dynamic schema, indexing, aggregation, map-reduce and replication. It is widely used in content management systems, e-commerce platforms, social media websites, IoT applications, and mobile application development.

On Linux/macOS: Create the data directory and start the "mongod" service. On Windows: Create the data directory and start the MongoDB service from Service Manager. In Docker: Run the "docker run" command. On other platforms: Please consult the MongoDB documentation. Verification method: Run the "mongo" command to connect and view the server version.

The MongoDB database file is located in the MongoDB data directory, which is /data/db by default, which contains .bson (document data), ns (collection information), journal (write operation records), wiredTiger (data when using the WiredTiger storage engine ) and config (database configuration information) and other files.
