


Linux kernel memory fragmentation prevention technology: in-depth understanding of memory management
Have you ever run into memory problems on a Linux system, such as memory leaks or memory fragmentation? Understanding how the Linux kernel prevents memory fragmentation makes such problems easier to diagnose and address.
The Linux kernel organizes and manages physical memory with the buddy system, and physical memory fragmentation is one of the buddy system's weaknesses. To prevent and resolve fragmentation, the kernel has adopted several practical techniques, which are summarized here.
1 Consolidate fragments when memory is low
When pages are requested from the buddy allocator and no suitable page is found, the kernel performs two memory adjustment steps: compact and reclaim. The former consolidates fragments to obtain larger contiguous memory; the latter reclaims cache memory that does not have to stay resident. The focus here is on compact. The overall process is roughly as follows:
__alloc_pages_nodemask -> __alloc_pages_slowpath -> __alloc_pages_direct_compact -> try_to_compact_pages -> compact_zone_order -> compact_zone -> isolate_migratepages -> migrate_pages -> release_freepages
Not every failed allocation triggers compaction. First, order must be greater than 0 and gfp_mask must carry __GFP_FS and __GFP_IO. In addition, the free memory of the zone must meet a condition based on what the kernel calls the "fragmentation index". This value lies between 0 and 1000, and by default compaction is attempted only when the index is greater than 500. The default can be adjusted through the proc file extfrag_threshold. The fragmentation index is calculated by the fragmentation_index function:
    /*
     * Index is between 0 and 1000
     *
     * 0 => allocation would fail due to lack of memory
     * 1000 => allocation would fail due to fragmentation
     */
    return 1000 - div_u64( (1000+(div_u64(info->free_pages * 1000ULL, requested))), info->free_blocks_total);
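To get a feel for the numbers, the return statement above can be reproduced in user space. This is only a sketch: struct frag_info, frag_index and should_compact are hypothetical names, the zero-blocks guard is added here for safety, and the eligibility test simply restates the conditions described in the text (order greater than 0, both gfp flags present, index above the threshold).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-zone free-space summary. */
struct frag_info {
    uint64_t free_pages;        /* total free pages in the zone */
    uint64_t free_blocks_total; /* number of free blocks of any order */
};

/* Same arithmetic as the kernel's return statement, in plain integer math. */
int frag_index(unsigned int order, const struct frag_info *info)
{
    uint64_t requested;
    int64_t quotient;

    assert(info != 0 && order < 32);
    requested = 1ULL << order;

    /* Guard added for this sketch: avoid dividing by zero. */
    if (info->free_blocks_total == 0)
        return 0;

    quotient = (int64_t)((1000 + info->free_pages * 1000ULL / requested)
                         / info->free_blocks_total);
    return (int)(1000 - quotient);
}

/* Restates the conditions from the text; 500 is the default threshold. */
int should_compact(unsigned int order, int has_gfp_fs, int has_gfp_io,
                   int index, int threshold)
{
    return order > 0 && has_gfp_fs && has_gfp_io && index > threshold;
}
```

With 1000 free pages spread over 1000 single-page blocks, an order-4 request yields an index of 937, which is above the default threshold of 500, so the failure is attributed to fragmentation rather than to a lack of memory.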
During compaction, pages move only within their own zone: pages at low addresses in the zone are moved toward the end of the zone as far as possible. The destination page for each move is obtained through the compaction_alloc function.
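The movement pattern can be illustrated with a toy model, assuming a zone is just an array of page states. One index scans upward from the start looking for movable pages and another scans downward from the end looking for free pages, and they stop when they meet. All names here (page_state, compact_toy, longest_free_run) are invented for the sketch; the kernel works on pageblocks and page structures, not on a flat array.

```c
#include <assert.h>
#include <stddef.h>

enum page_state { PAGE_FREE, PAGE_MOVABLE, PAGE_UNMOVABLE };

/* Length of the longest run of contiguous free pages in the toy zone. */
size_t longest_free_run(const enum page_state *zone, size_t n)
{
    size_t best = 0, run = 0;

    for (size_t i = 0; i < n; i++) {
        run = (zone[i] == PAGE_FREE) ? run + 1 : 0;
        if (run > best)
            best = run;
    }
    return best;
}

/* Moves movable pages from low addresses into free pages near the end.
 * Returns the number of pages moved. */
size_t compact_toy(enum page_state *zone, size_t n)
{
    size_t moved = 0;
    size_t migrate = 0;   /* scans upward for movable pages */
    size_t free_scan = n; /* scans downward; candidate is free_scan - 1 */

    assert(zone != NULL || n == 0);
    while (migrate < free_scan) {
        if (zone[migrate] != PAGE_MOVABLE) {
            migrate++;
            continue;
        }
        /* Find the highest free page that is still above the movable one. */
        while (free_scan > migrate + 1 && zone[free_scan - 1] != PAGE_FREE)
            free_scan--;
        if (free_scan <= migrate + 1)
            break; /* scanners met: nothing left to move into */

        zone[free_scan - 1] = PAGE_MOVABLE;
        zone[migrate] = PAGE_FREE;
        free_scan--;
        migrate++;
        moved++;
    }
    return moved;
}
```

For a zone laid out as U M F M F M F F (U unmovable, M movable, F free), two pages are moved and the longest free run grows from 2 to 4, while the unmovable page stays where it is.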
Migration can be synchronous or asynchronous. After an allocation fails, the first compact runs asynchronously; the compact that follows reclaim runs synchronously. Asynchronous mode moves only pages that can be migrated without blocking and skips the rest, while synchronous mode traverses all MOVABLE pages and waits for pages that are in use before moving them.
2 Organize pages by mobility
Memory pages are divided into the following three types according to mobility:
UNMOVABLE: The location in the memory is fixed and cannot be moved at will. The memory allocated by the kernel basically belongs to this type;
RECLAIMABLE: Cannot be moved, but can be deleted and recycled. For example, file mapped memory;
MOVABLE: It can be moved at will. User space memory basically belongs to this type.
An allocation is first served from the free pages of the requested mobility type. The free memory of each zone is organized as follows:
    struct zone {
        ......
        struct free_area free_area[MAX_ORDER];
        ......
    };

    struct free_area {
        struct list_head free_list[MIGRATE_TYPES];
        unsigned long nr_free;
    };
When memory cannot be obtained from the free_area of the requested type, it can be taken from a fallback type. Memory taken this way is released to the free list of the newly requested type. The kernel calls this process "stealing".
The fallback type priority list is defined as follows:
    static int fallbacks[MIGRATE_TYPES][4] = {
        [MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
        [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
    #ifdef CONFIG_CMA
        [MIGRATE_MOVABLE]     = { MIGRATE_CMA, MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_RESERVE },
        [MIGRATE_CMA]         = { MIGRATE_RESERVE }, /* Never used */
    #else
        [MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_RESERVE },
    #endif
        [MIGRATE_RESERVE]     = { MIGRATE_RESERVE }, /* Never used */
    #ifdef CONFIG_MEMORY_ISOLATION
        [MIGRATE_ISOLATE]     = { MIGRATE_RESERVE }, /* Never used */
    #endif
    };
It is worth noting that grouping pages by mobility does not suit every scenario. When there is too little memory to divide among the types, it should not be enabled. A global variable, set during memory initialization, records whether grouping is disabled:
    void __ref build_all_zonelists(pg_data_t *pgdat, struct zone *zone)
    {
        ......
        if (vm_total_pages < (pageblock_nr_pages * MIGRATE_TYPES))
            page_group_by_mobility_disabled = 1;
        else
            page_group_by_mobility_disabled = 0;
        ......
    }
If page_group_by_mobility_disabled is set, all memory is treated as unmovable.
A parameter, pageblock_nr_pages, determines the minimum number of pages in each mobility area. It is defined as follows:
    #ifdef CONFIG_HUGETLB_PAGE
    ......
    #define pageblock_order HUGETLB_PAGE_ORDER
    ......
    #else /* CONFIG_HUGETLB_PAGE */
    /* If huge pages are not used, group by MAX_ORDER_NR_PAGES */
    #define pageblock_order (MAX_ORDER-1)
    #endif /* CONFIG_HUGETLB_PAGE */
    #define pageblock_nr_pages (1UL << pageblock_order)
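As a worked example of these two definitions, the sketch below computes the pageblock size from an order and applies the small-memory check, assuming the check compares the total page count against one pageblock per migrate type. The function names are hypothetical, and both the order and the number of migrate types depend on the kernel configuration, so they are passed in as parameters rather than hard-coded.

```c
#include <assert.h>

/* Pages per pageblock: 1UL << pageblock_order. */
unsigned long pageblock_pages(unsigned int pageblock_order)
{
    assert(pageblock_order < 8 * sizeof(unsigned long));
    return 1UL << pageblock_order;
}

/* Returns 1 when there is too little memory to give every migrate type
 * at least one pageblock, i.e. when grouping by mobility is pointless. */
int mobility_grouping_disabled(unsigned long vm_total_pages,
                               unsigned int pageblock_order,
                               unsigned int migrate_types)
{
    return vm_total_pages < pageblock_pages(pageblock_order) * migrate_types;
}
```

With an assumed pageblock_order of 9 and 5 migrate types, the threshold is 512 * 5 = 2560 pages, which is 10 MiB if pages are 4 KiB.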
During system initialization, all pages are marked MOVABLE:
    void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
            unsigned long start_pfn, enum memmap_context context)
    {
        ......
        if ((z->zone_start_pfn
        ......
Pages of the other mobility types are produced later, through the "stealing" mentioned above. When this happens, the kernel usually steals a larger contiguous block from a higher-priority fallback type, to avoid creating small fragments.
    /* Remove an element from the buddy allocator from the fallback list */
    static inline struct page *
    __rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
    {
        ......
        /* Find the largest possible block of pages in the other list */
        for (current_order = MAX_ORDER-1; current_order >= order;
                --current_order) {
            for (i = 0;; i++) {
                migratetype = fallbacks[start_migratetype][i];
                ......
    }
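The search order of the two nested loops can be tried out on a small model. The sketch below assumes a reduced set of three migrate types and a table of free-block counts indexed by order and type; toy_fallbacks, steal_result and find_fallback are made-up names, and the real function also handles the reserve type and splits the stolen block.

```c
#include <assert.h>

enum { T_UNMOVABLE, T_RECLAIMABLE, T_MOVABLE, T_COUNT };
#define TOY_MAX_ORDER 4

/* Fallback priority per type, modelled on the kernel table but without
 * the CMA, RESERVE and ISOLATE entries. */
static const int toy_fallbacks[T_COUNT][T_COUNT - 1] = {
    [T_UNMOVABLE]   = { T_RECLAIMABLE, T_MOVABLE },
    [T_RECLAIMABLE] = { T_UNMOVABLE, T_MOVABLE },
    [T_MOVABLE]     = { T_RECLAIMABLE, T_UNMOVABLE },
};

struct steal_result {
    int order;     /* order of the block found, or -1 */
    int from_type; /* type it would be stolen from, or -1 */
};

/* nr_free[order][type] holds the number of free blocks of that order and
 * type. Largest orders are tried first; within one order the fallback
 * types are tried in priority order. */
struct steal_result find_fallback(unsigned int nr_free[TOY_MAX_ORDER][T_COUNT],
                                  int order, int start_type)
{
    struct steal_result none = { -1, -1 };

    assert(order >= 0 && order < TOY_MAX_ORDER);
    assert(start_type >= 0 && start_type < T_COUNT);

    for (int cur = TOY_MAX_ORDER - 1; cur >= order; cur--) {
        for (int i = 0; i < T_COUNT - 1; i++) {
            int type = toy_fallbacks[start_type][i];

            if (nr_free[cur][type] > 0) {
                struct steal_result hit = { cur, type };
                return hit;
            }
        }
    }
    return none;
}
```

If an order-1 unmovable request finds one free reclaimable block of order 1 and one free movable block of order 3, the sketch picks the order-3 movable block: block size takes precedence over fallback priority, which keeps small blocks of other types from being scattered with foreign pages.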
You can view the current distribution of pages across the mobility types through /proc/pagetypeinfo.
3 Virtual movable memory zone
Before the technique of grouping pages by mobility, another method had already been merged into the kernel: the virtual memory zone ZONE_MOVABLE. The basic idea is simple: divide memory into two parts, movable and non-movable.
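A row of that file can be turned into a page count with a few lines of C. The sketch below assumes the usual row layout, "Node N, zone NAME, type TYPE" followed by one free-block count per order starting at order 0; the exact layout may differ between kernel versions, and pagetypeinfo_row_pages is a hypothetical helper, not a kernel or libc function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parses one per-type row and returns the number of free pages it
 * describes (count at order k contributes count << k pages).
 * Copies the type name into type_out. Returns -1 for other lines. */
long pagetypeinfo_row_pages(const char *line, char *type_out, size_t type_len)
{
    const char *p;
    char type[32];
    int consumed = 0;
    long total = 0;

    assert(line != NULL && type_out != NULL && type_len > 0);

    if (strncmp(line, "Node", 4) != 0)
        return -1;
    p = strstr(line, ", type");
    if (p == NULL)
        return -1;
    p += strlen(", type");

    if (sscanf(p, "%31s%n", type, &consumed) != 1)
        return -1;
    p += consumed;

    /* One column per order; stop at the first non-numeric token. */
    for (unsigned int order = 0; order < 16; order++) {
        char *end;
        long count = strtol(p, &end, 10);

        if (end == p)
            break;
        total += count << order;
        p = end;
    }

    snprintf(type_out, type_len, "%s", type);
    return total;
}
```

Feeding each line of the file through this helper and summing the results per type gives a rough view of how much free memory each mobility type currently holds. Reading the file may require root privileges on some systems.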
    enum zone_type {
    #ifdef CONFIG_ZONE_DMA
        ZONE_DMA,
    #endif
    #ifdef CONFIG_ZONE_DMA32
        ZONE_DMA32,
    #endif
        ZONE_NORMAL,
    #ifdef CONFIG_HIGHMEM
        ZONE_HIGHMEM,
    #endif
        ZONE_MOVABLE,
        __MAX_NR_ZONES
    };
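Enabling ZONE_MOVABLE requires the kernel parameter kernelcore or movablecore. kernelcore specifies the amount of non-movable memory, and movablecore specifies the amount of movable memory. If both are specified, the one that results in the larger amount of non-movable memory is used. If neither is specified, ZONE_MOVABLE is not enabled.
Unlike the other zones, ZONE_MOVABLE is not associated with any physical memory range; its memory is taken from the high memory zone or the normal zone.
find_zone_movable_pfns_for_nodes calculates the amount of ZONE_MOVABLE memory in each node. The memory it uses usually comes from the highest zone of each node, as can be seen in the function find_usable_zone_for_movable.
When ZONE_MOVABLE memory is assigned to each node, kernelcore is divided evenly among the nodes: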
kernelcore_node = required_kernelcore / usable_nodes;
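The two rules, choosing between kernelcore and movablecore and then splitting the result across nodes, can be written out as a small sketch. It assumes all quantities are in pages and that a value of 0 means the parameter was not given; the function names are hypothetical, and the kernel additionally rebalances when a node is smaller than its share, which is left out here.

```c
#include <assert.h>

/* If movablecore is given, it implies total - movablecore non-movable
 * pages; the larger non-movable amount of the two parameters wins. */
unsigned long required_kernelcore(unsigned long total_pages,
                                  unsigned long kernelcore,
                                  unsigned long movablecore)
{
    unsigned long corepages;

    if (movablecore == 0)
        return kernelcore;

    corepages = movablecore < total_pages ? total_pages - movablecore : 0;
    return kernelcore > corepages ? kernelcore : corepages;
}

/* Even split of the non-movable requirement across usable nodes. */
unsigned long kernelcore_per_node(unsigned long required,
                                  unsigned int usable_nodes)
{
    assert(usable_nodes > 0);
    return required / usable_nodes;
}
```

For example, with 10000 pages in total, kernelcore of 2000 pages and movablecore of 6000 pages, movablecore implies 4000 non-movable pages, so 4000 is used, and on a two-node machine each node is asked to keep 2000 pages non-movable.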
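When the kernel allocates a page, if the gfp flags specify both __GFP_HIGHMEM and __GFP_MOVABLE, the memory is requested from the ZONE_MOVABLE zone.
In short, the kernel's techniques for preventing memory fragmentation are an important part of Linux memory management, and understanding them helps you understand how a Linux system manages its memory.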
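The flag combination can be shown with a simplified zone picker. This sketch models only the two flags mentioned here and three zones; the bit values and the TOY_ names are illustrative, and the kernel's own lookup also covers the DMA zones and uses a table rather than if statements.

```c
#include <assert.h>

/* Illustrative bit values for the two zone-related flags in this sketch. */
#define TOY_GFP_HIGHMEM 0x02u
#define TOY_GFP_MOVABLE 0x08u

enum toy_zone { TOY_ZONE_NORMAL, TOY_ZONE_HIGHMEM, TOY_ZONE_MOVABLE };

/* Picks the preferred zone: only the combination of both flags selects
 * the movable zone; the movable flag on its own does not. */
enum toy_zone toy_gfp_zone(unsigned int flags)
{
    int highmem = (flags & TOY_GFP_HIGHMEM) != 0;
    int movable = (flags & TOY_GFP_MOVABLE) != 0;

    /* Only the two modelled flags are supported by this sketch. */
    assert((flags & ~(TOY_GFP_HIGHMEM | TOY_GFP_MOVABLE)) == 0);

    if (highmem && movable)
        return TOY_ZONE_MOVABLE;
    if (highmem)
        return TOY_ZONE_HIGHMEM;
    return TOY_ZONE_NORMAL;
}
```

Typical user-space page allocations carry both flags, which is why user pages are the main consumers of ZONE_MOVABLE, while ordinary kernel allocations carry neither and stay in the normal zone.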