How to use MongoDB as a pure in-memory DB (Redis s
原po The idea There has been a growing interest in using MongoDB as an in-memory database, meaning that the data is not stored on disk at all. This can be super useful for applications like: a write-heavy cache in front of a slower RDBMS s
原po
The idea
There has been a growing interest in using MongoDB as an in-memory database, meaning that the data is not stored on disk at all. This can be super useful for applications like:
- a write-heavy cache in front of a slower RDBMS system
- embedded systems
- PCI compliant systems where no data should be persisted
- unit testing where the database should be light and easily cleaned
That would be really neat indeed if it was possible: one could leverage the advanced querying / indexing capabilities of MongoDB without hitting the disk. As you probably know the disk IO (especially random) is the system bottleneck in 99% of cases, and if you are writing data you cannot avoid hitting the disk.
One sweet design choice of MongoDB is that it uses memory-mapped files to handle access to data files on disk. This means that MongoDB does not know the difference between RAM and disk, it just accesses bytes at offsets in giant arrays representing files and the OS takes care of the rest! It is this design decision that allows MongoDB to run in RAM with no modification.
How it is done
This is all achieved by using a special type of filesystem called tmpfs. Linux will make it appear as a regular FS but it is entirely located in RAM (unless it is larger than RAM in which case it can swap, which can be useful!). I have 32GB RAM on this server, let’s create a 16GB tmpfs:
<code># mkdir /ramdata # mount -t tmpfs -o size=16000M tmpfs /ramdata/ # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvde1 5905712 4973924 871792 86% / none 15344936 0 15344936 0% /dev/shm tmpfs 16384000 0 16384000 0% /ramdata</code>
Now let’s start MongoDB with the appropriate settings. smallfiles and noprealloc should be used to reduce the amount of RAM wasted, and will not affect performance since it’s all RAM based. nojournal should be used since it does not make sense to have a journal in this context!
<code>dbpath=/ramdata nojournal = true smallFiles = true noprealloc = true</code>
After starting MongoDB, you will find that it works just fine and the files are as expected in the FS:
<code># mongo MongoDB shell version: 2.3.2 connecting to: test > db.test.insert({a:1}) > db.test.find() { "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 } # ls -l /ramdata/ total 65684 -rw-------. 1 root root 16777216 Apr 30 15:52 local.0 -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock -rw-------. 1 root root 16777216 Apr 30 15:52 test.0 -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns drwxr-xr-x. 2 root root 40 Apr 30 15:52 _tmp</code>
Now let’s add some data and make sure it behaves properly. We will create a 1KB document and add 4 million of them:
<code>> str = "" > aaa = "aaaaaaaaaa" aaaaaaaaaa > for (var i = 0; i for (var i = 0; i db.foo.stats() { "ns" : "test.foo", "count" : 4000000, "size" : 4544000160, "avgObjSize" : 1136.00004, "storageSize" : 5030768544, "numExtents" : 26, "nindexes" : 1, "lastExtentSize" : 536600560, "paddingFactor" : 1, "systemFlags" : 1, "userFlags" : 0, "totalIndexSize" : 129794000, "indexSizes" : { "_id_" : 129794000 }, "ok" : 1 }</code>
The document average size is 1136 bytes and it takes up about 5GB of storage. The index on _id takes about 130MB. Now we need to verify something very important: is the data duplicated in RAM, existing both within MongoDB and the filesystem? Remember that MongoDB does not buffer any data within its own process, instead data is cached in the FS cache. Let’s drop the FS cache and see what is in RAM:
<code># echo 3 > /proc/sys/vm/drop_caches # free total used free shared buffers cached Mem: 30689876 6292780 24397096 0 1044 5817368 -/+ buffers/cache: 474368 30215508 Swap: 0 0 0</code>
As you can see there is 6.3GB of used RAM of which 5.8GB is in FS cache (buffers). Why is there still 5.8GB of FS cache even after all caches were dropped?? The reason is that Linux is smart and it does not duplicate the pages between tmpfs and its cache… Bingo! That means your data exists with a single copy in RAM. Let’s access all documents and verify RAM usage is unchanged:
<code>> db.foo.find().itcount() 4000000 # free total used free shared buffers cached Mem: 30689876 6327988 24361888 0 1324 5818012 -/+ buffers/cache: 508652 30181224 Swap: 0 0 0 # ls -l /ramdata/ total 5808780 -rw-------. 1 root root 16777216 Apr 30 15:52 local.0 -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock -rw-------. 1 root root 16777216 Apr 30 16:00 test.0 -rw-------. 1 root root 33554432 Apr 30 16:00 test.1 -rw-------. 1 root root 536608768 Apr 30 16:02 test.10 -rw-------. 1 root root 536608768 Apr 30 16:03 test.11 -rw-------. 1 root root 536608768 Apr 30 16:03 test.12 -rw-------. 1 root root 536608768 Apr 30 16:04 test.13 -rw-------. 1 root root 536608768 Apr 30 16:04 test.14 -rw-------. 1 root root 67108864 Apr 30 16:00 test.2 -rw-------. 1 root root 134217728 Apr 30 16:00 test.3 -rw-------. 1 root root 268435456 Apr 30 16:00 test.4 -rw-------. 1 root root 536608768 Apr 30 16:01 test.5 -rw-------. 1 root root 536608768 Apr 30 16:01 test.6 -rw-------. 1 root root 536608768 Apr 30 16:04 test.7 -rw-------. 1 root root 536608768 Apr 30 16:03 test.8 -rw-------. 1 root root 536608768 Apr 30 16:02 test.9 -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns drwxr-xr-x. 2 root root 40 Apr 30 16:04 _tmp # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvde1 5905712 4973960 871756 86% / none 15344936 0 15344936 0% /dev/shm tmpfs 16384000 5808780 10575220 36% /ramdata</code>
And that verifies it! :)
What about replication?
You probably want to use replication since a server loses its RAM data upon reboot! Using a standard replica set you will get automatic failover and more read capacity. If a server is rebooted MongoDB will automatically rebuild its data by pulling it from another server in the same replica set (resync). This should be fast enough even in cases with a lot of data and indices since all operations are RAM only :)
It is important to remember that write operations get written to a special collection called oplog which resides in the local database and takes 5% of the volume by default. In my case the oplog would take 5% of 16GB which is 800MB. In doubt, it is safer to choose a fixed oplog size using the oplogSize option. If a secondary server is down for a longer time than the oplog contains, it will have to be resynced. To set it to 1GB, use:
<code>oplogSize = 1000</code>
What about sharding?
Now that you have all the querying capabilities of MongoDB, what if you want to implement a large service with it? Well you can use sharding freely to implement a large scalable in-memory store. Still the config servers (that contain the chunk distribution) should be disk based since their activity is small and rebuilding a cluster from scratch is not fun.
What to watch for
RAM is a scarce resource, and in this case you definitely want the entire data set to fit in RAM. Even though tmpfs can resort to swapping the performance would drop dramatically. To make best use of the RAM you should consider:
- usePowerOf2Sizes option to normalize the storage buckets
- run a compact command or resync the node periodically.
- use a schema design that is fairly normalized (avoid large document growth)
Conclusion
Sweet, you can now use MongoDB and all its features as an in-memory RAM-only store! Its performance should be pretty impressive: during the test with a single thread / core I was achieving 20k writes per second, and it should scale linearly over the number of cores.
原文地址:How to use MongoDB as a pure in-memory DB (Redis s, 感谢原作者分享。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

It is recommended to use the latest version of MongoDB (currently 5.0) as it provides the latest features and improvements. When selecting a version, you need to consider functional requirements, compatibility, stability, and community support. For example, the latest version has features such as transactions and aggregation pipeline optimization. Make sure the version is compatible with the application. For production environments, choose the long-term support version. The latest version has more active community support.

Node.js is a server-side JavaScript runtime, while Vue.js is a client-side JavaScript framework for creating interactive user interfaces. Node.js is used for server-side development, such as back-end service API development and data processing, while Vue.js is used for client-side development, such as single-page applications and responsive user interfaces.

With the development of the Internet, people's lives are becoming more and more digital, and the demand for personalization is becoming stronger and stronger. In this era of information explosion, users are often faced with massive amounts of information and have no choice, so the importance of real-time recommendation systems has become increasingly prominent. This article will share the experience of using MongoDB to implement a real-time recommendation system, hoping to provide some inspiration and help to developers. 1. Introduction to MongoDB MongoDB is an open source NoSQL database known for its high performance, easy scalability and flexible data model. Compared to biography

The data of the MongoDB database is stored in the specified data directory, which can be located in the local file system, network file system or cloud storage. The specific location is as follows: Local file system: The default path is Linux/macOS:/data/db, Windows: C:\data\db. Network file system: The path depends on the file system. Cloud Storage: The path is determined by the cloud storage provider.

The MongoDB database is known for its flexibility, scalability, and high performance. Its advantages include: a document data model that allows data to be stored in a flexible and unstructured way. Horizontal scalability to multiple servers via sharding. Query flexibility, supporting complex queries and aggregation operations. Data replication and fault tolerance ensure data redundancy and high availability. JSON support for easy integration with front-end applications. High performance for fast response even when processing large amounts of data. Open source, customizable and free to use.

MongoDB is a document-oriented, distributed database system used to store and manage large amounts of structured and unstructured data. Its core concepts include document storage and distribution, and its main features include dynamic schema, indexing, aggregation, map-reduce and replication. It is widely used in content management systems, e-commerce platforms, social media websites, IoT applications, and mobile application development.

The MongoDB database file is located in the MongoDB data directory, which is /data/db by default, which contains .bson (document data), ns (collection information), journal (write operation records), wiredTiger (data when using the WiredTiger storage engine ) and config (database configuration information) and other files.

On Linux/macOS: Create the data directory and start the "mongod" service. On Windows: Create the data directory and start the MongoDB service from Service Manager. In Docker: Run the "docker run" command. On other platforms: Please consult the MongoDB documentation. Verification method: Run the "mongo" command to connect and view the server version.
