关于高端内存的权威解释-Mysql Tutorial-php.cn

Home

Database

Mysql Tutorial

关于高端内存的权威解释

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 pm 03:11 PM

about Memory explain high-end

注：本文是我见到的所有关于高端内存解释的最详细、最清晰的解释，其他帖子寥寥数语写的都是垃圾，保存下来只为方便后来人和我自己，感谢原文作者！原文地址：http://bbs.chinaunix.net/thread-1938084-1-1.html 注：本文提及的物理地址空间可以理解

注：本文是我见到的所有关于高端内存解释的最详细、最清晰的解释，其他帖子寥寥数语写的都是垃圾，保存下来只为方便后来人和我自己，感谢原文作者！

原文地址：http://bbs.chinaunix.net/thread-1938084-1-1.html

注：本文提及的物理地址空间可以理解为就是物理内存，但是在某些情况下，把他们理解为物理内存是不对的。

本文讨论的环境是NON-PAE的i386平台，内核版本2.6.31-14

一.什么是高端内存

linux中内核使用3G-4G的线性地址空间，也就是说总共只有1G的地址空间可以用来映射物理地址空间。但是，如果内存大于1G的情况下呢？是不是超过1G的内存就无法使用了呢？为此内核引入了一个高端内存的概念，把1G的线性地址空间划分为两部分：小于896M物理地址空间的称之为低端内存，这部分内存的物理地址和3G开始的线性地址是一一对应映射的,也就是说内核使用的线性地址空间3G--(3G+896M)和物理地址空间0-896M一一对应；剩下的128M的线性空间用来映射剩下的大于896M的物理地址空间，这也就是我们通常说的高端内存区。

所谓的建立高端内存的映射就是能用一个线性地址来访问高端内存的页。如何理解这句话呢？在开启分页后，我们要访问一个物理内存地址，需要经过MMU的转换，也就是一个32位地址vaddr的高10位用来查找该vaddr所在页目录项，用12-21位来查找页表项，再用0-11位偏移和页的起始物理地址相加得到paddr,再把该paddr放到前端总线上，那么我们就可以访问该vaddr对应的物理内存了。在低端内存中，每一个物理内存页在系统初始化的时候都已经存在这样一个映射了。而高端内存还不存在这样一个映射(页目录项，页表都是空的)，所以我们必须要在系统初始化完后，提供一系列的函数来实现这个功能，这就是所谓的高端内存的映射。那么我们为什么不再系统初始化的时候把所有的内存映射都建立好呢？主要原因是，内核线性地址空间不足以容纳所有的物理地址空间（1G的内核线性地址空间和最多可达4G的物理地址空间），所以才需要预留一部分（128M）的线性地址空间来动态的映射所有的物理地址空间，于是就产生了所谓的高端内存映射。

二.内核如何管理高端内存

上面的图展示了内核如何使用3G-4G的线性地址空间，首先解释下什么是high_memory

在arch/x86/mm/init_32.c里面由如下代码：

#ifdef CONFIG_HIGHMEM

highstart_pfn = highend_pfn = max_pfn;

if (max_pfn > max_low_pfn)

highstart_pfn = max_low_pfn;

e820_register_active_regions(0, 0, highend_pfn);

sparse_memory_present_with_active_regions(0);

printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",

pages_to_mb(highend_pfn - highstart_pfn));

num_physpages = highend_pfn;

high_memory = (void *) __va(highstart_pfn * PAGE_SIZE-1)+1;

#else

e820_register_active_regions(0, 0, max_low_pfn);

sparse_memory_present_with_active_regions(0);

num_physpages = max_low_pfn;

high_memory = (void *) __va(max_low_pfn * PAGE_SIZE - 1)+1;

#endif

high_memory是“具体物理内存的上限对应的虚拟地址”,可以这么理解：当内存内存小于896M时，那么high_memory = (void *)__va(max_low_pfn * PAGE_SIZE），max_low_pfn就是在内存中最后的一个页帧号,所以high_memory=0xc0000000+物理内存大小;当内存大于896M时，那么highstart_pfn= max_low_pfn,此时max_low_pfn就不是物理内存的最后一个页帧号了，而是内存为896M时的最后一个页帧号，那么high_memory=0xc0000000+896M.总之high_memory是不能超过0xc0000000+896M.

由于我们讨论的是物理内存大于896M的情况，所以high_memory实际上就是0xc0000000+896M，从high_memory开始的128M(4G-high_memory)就是用作用来映射剩下的大于896M的内存的，当然这128M还可以用来映射设备的内存(MMIO)。

从上图我们看到有VMALLOC_START,VMALLOC_END,PKMAP_BASE,FIX_ADDRESS_START等宏术语，其实这些术语划分了这128M的线性空间，一共分为三个区域：VMALLOC区域(本文不涉及这部分内容，关注本博客的其他文章)，永久映射区（permanetkernelmappings）, 临时映射区（temporary kernelmappings）.这三个区域都可以用来映射高端内存，本文重点阐述下后两个区域是如何映射高端内存的。

三.永久映射区（permanet kernel mappings）

1.介绍几个定义：

PKMAP_BASE:永久映射区的起始线性地址。

pkmap_page_table：永久映射区对应的页表。

LAST_PKMAP：pkmap_page_table里面包含的entry的数量=1024

pkmap_count[LAST_PKMAP]数组：每一个元素的值对应一个entry的引用计数。关于引用计数的值，有以下几种情况：

0：说明这个entry可用。

1：entry不可用，虽然这个entry没有被用来映射任何内存，但是他仍然存在TLBentry没有被flush,

所以还是不可用。

N：有N-1个对象正在使用这个页面

首先，要知道这个区域的大小是4M,也就是说128M的线性地址空间里面，只有4M的线性地址空间是用来作永久映射区的。至于到底是哪4M，是由PKMAP_BASE决定的，这个变量表示用来作永久内存映射的4M区间的起始线性地址。

在NON-PAE的i386上，页目录里面的每一项都指向一个4M的空间，所以永久映射区只需要一个页目录项就可以了。而一个页目录项指向一张页表，那么永久映射区正好就可以用一张页表来表示了，于是我们就用pkmap_page_table来指向这张页表。

pgd = swapper_pg_dir + pgd_index(vaddr);

pud = pud_offset(pgd, vaddr);//pud==pgd

pmd = pmd_offset(pud, vaddr);//pmd==pud==pgd

pte = pte_offset_kernel(pmd, vaddr);

pkmap_page_table = pte;

2.具体代码分析（2.6.31）

void *kmap(struct page *page)

{

might_sleep();

if (!PageHighMem(page))

return page_address(page);

return kmap_high(page);

}

kmap()函数就是用来建立永久映射的函数：由于调用kmap函数有可能会导致进程阻塞，所以它不能在中断处理函数等不可被阻塞的上下文下被调用，might_sleep()的作用就是当该函数在不可阻塞的上下文下被调用是，打印栈信息。接下来判断该需要建立永久映射的页是否确实属于高端内存，因为我们知道低端内存的每个页都已经存在和线性地址的映射了，所以，就不需要再建立了，page_address()函数返回该page对应的线性地址。（关于page_address()函数，参考本博客的专门文章有解释）。最后调用kmap_high(page),可见kmap_high()才真正执行建立永久映射的操作。

/**

* kmap_high - map a highmem page into memory

* @page: &struct page to map

* Returns the page's virtual memory address.

* We cannot call this from interrupts, as it may block.

void *kmap_high(struct page *page)

{

unsigned long vaddr;

* For highmem pages, we can't trust "virtual" until

* after we have the lock.

lock_kmap();

vaddr = (unsigned long)page_address(page);

if (!vaddr)

vaddr = map_new_virtual(page);

pkmap_count[PKMAP_NR(vaddr)]++;

BUG_ON(pkmap_count[PKMAP_NR(vaddr)] 2);

unlock_kmap();

return (void*) vaddr;

}

kmap_high函数分析：首先获得对pkmap_page_table操作的锁,然后再调用page_address()来返回该page是否已经被映射，我们看到前面在kmap()里面已经判断过了，为什么这里还要再次判断呢?因为再获的锁的时候，有可能锁被其他CPU拿走了，而恰巧其他CPU拿了这个锁之后，也是执行这段code,而且映射的也是同一个page，那么当它把锁释放掉的时候，其实就表示该page的映射已经被建立了，我们这里就没有必要再去执行这段code了，所以就有必要在获得锁后再判断下。

如果发现vaddr不为空，那么就是刚才说的，已经被其他cpu上执行的任务给建立了，这里只需要把表示该页引用计数的pkmap_count[]再加一就可以了。同时调用BUG_ON来确保该引用计数确实是不小于2的，否则就是有问题的了。然后返回vaddr,整个建立就完成了。

如果发现vaddr为空呢？调用map_new_virtual()函数，到此我们看到，其实真正进行建立映射的代码在这个函数里面

static inline unsigned long map_new_virtual(struct page *page)

{

unsigned long vaddr;

int count;

start:

count = LAST_PKMAP;//LAST_PKMAP=1024

/* Find an empty entry */

for (;;) {

last_pkmap_nr = (last_pkmap_nr + 1) & LAST_PKMAP_MASK;

if (!last_pkmap_nr) {

flush_all_zero_pkmaps();

count = LAST_PKMAP;

}

if (!pkmap_count[last_pkmap_nr])

break; /* Found a usable entry */

if (--count)

continue;

* Sleep for somebody else to unmap their entries

{

DECLARE_WAITQUEUE(wait, current);

__set_current_state(TASK_UNINTERRUPTIBLE);

add_wait_queue(&pkmap_map_wait, &wait);

unlock_kmap();

schedule();

remove_wait_queue(&pkmap_map_wait, &wait);

lock_kmap();

/* Somebody else might have mapped it while we slept */

if (page_address(page))

return (unsigned long)page_address(page);

/* Re-start */

goto start;

}

vaddr = PKMAP_ADDR(last_pkmap_nr);

set_pte_at(&init_mm, vaddr,

&(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));

pkmap_count[last_pkmap_nr] = 1;

set_page_address(page, (void *)vaddr);

return vaddr;

}

last_pkmap_nr：记录上次被分配的页表项在pkmap_page_table里的位置，初始值为０，所以第一次分配的时候last_pkmap_nr等于１。

接下来判断什么时候last_pkmap_nr等于０，等于０就表示1023（LAST_PKMAP(1024)-1）个页表项已经被分配了,这时候就需要调用flush_all_zero_pkmaps()函数,把所有pkmap_count[]计数为1的页表项在TLB里面的entry给flush掉，并重置为0，这就表示该页表项又可以用了，可能会有疑惑为什么不在把pkmap_count置为1的时候也就是解除映射的同时把TLB也flush呢？个人感觉有可能是为了效率的问题吧，毕竟等到不够的时候再刷新，效率要好点吧。

再判断pkmap_count[last_pkmap_nr]是否为0，0的话就表示这个页表项是可用的，那么就跳出循环了到下面了。

PKMAP_ADDR(last_pkmap_nr)返回这个页表项对应的线性地址vaddr.

#definePKMAP_ADDR(nr) (PKMAP_BASE + ((nr)

set_pte_at(mm,addr, ptep, pte)函数在NON-PAE i386上的实现其实很简单，其实就等同于下面的代码：

staticinline void native_set_pte(pte_t *ptep , pte_t pte)

{

*ptep = pte;

}

我们已经知道页表的线性起始地址存放在pkmap_page_table里面，那么相应的可用的页表项的地址就是&pkmap_page_table[last_pkmap_nr],得到了页表项的地址，只要把相应的pte填写进去，那么整个映射不就完成了吗？

pte由两部分组成：高20位表示物理地址，低12位表示页的描述信息。

怎么通过page查找对应的物理地址呢（参考page_address()一文）？其实很简单，用(page- mem_map) 再移PAGE_SHIFT位就可以了。

低12位的页描述信息是固定的:kmap_prot=(_PAGE_PRESENT| _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_GLOBAL).

下面的代码就是做了这些事情：

mk_pte(page,kmap_prot));

#definemk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot))

#definepage_to_pfn __page_to_pfn

#define__page_to_pfn(page) ((unsigned long)((page) - mem_map) + \

ARCH_PFN_OFFSET)

staticinline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)

{

return __pte(((phys_addr_t)page_nr

massage_pgprot(pgprot));

}

接下来把pkmap_count[last_pkmap_nr]置为1，1不是表示不可用吗，既然映射已经建立好了，应该赋值为2呀，其实这个操作是在他的上层函数kmap_high里面完成的(pkmap_count[PKMAP_NR(vaddr)]++).

到此为止，整个映射就完成了，再把page和对应的线性地址加入到page_address_htable哈希链表里面就可以了（参考page_address一文）。

我们继续看所有的页表项都已经用了的情况下，也就是1024个页表项全已经映射了内存了，如何处理。此时count==0,于是就进入了下面的代码:

* Sleepfor somebody else to unmap their entries

{

DECLARE_WAITQUEUE(wait, current);

__set_current_state(TASK_UNINTERRUPTIBLE);

add_wait_queue(&pkmap_map_wait, &wait);

unlock_kmap();

schedule();

remove_wait_queue(&pkmap_map_wait, &wait);

lock_kmap();

/* Somebody else might have mapped it while we slept */

if (page_address(page))

return (unsignedlong)page_address(page);

/* Re-start */

goto start;

}

这段代码其实很简单，就是把当前任务加入到等待队列pkmap_map_wait，当有其他任务唤醒这个队列时，再继续gotostart，重新整个过程。这里就是上面说的调用kmap函数有可能阻塞的原因。

那么什么时候会唤醒pkmap_map_wait队列呢？当调用kunmap_high函数，来释放掉一个映射的时候。

kunmap_high函数其实页很简单，就是把要释放的页表项的计数减1，如果等于1的时候，表示有可用的页表项了，再唤醒pkmap_map_wait队列

/**

*kunmap_high - map a highmem page into memory

* @page:&struct page to unmap

* IfARCH_NEEDS_KMAP_HIGH_GET is not defined then this may be called

* onlyfrom user context.

voidkunmap_high(struct page *page)

{

unsigned long vaddr;

unsigned long nr;

unsigned long flags;

int need_wakeup;

lock_kmap_any(flags);

vaddr = (unsigned long)page_address(page);

BUG_ON(!vaddr);

nr = PKMAP_NR(vaddr);

* A count must never go down to zero

* without a TLB flush!

need_wakeup = 0;

switch (--pkmap_count[nr]) {//减一

case 0:

BUG();

case 1:

* Avoidan unnecessary wake_up() function call.

* Thecommon case is pkmap_count[] == 1, but

* nowaiters.

* Thetasks queued in the wait-queue are guarded

* by boththe lock in the wait-queue-head and by

* thekmap_lock. As the kmap_lock is held here,

* no needfor the wait-queue-head's lock. Simply

* test ifthe queue is empty.

need_wakeup =waitqueue_active(&pkmap_map_wait);

}

unlock_kmap_any(flags);

/* do wake-up, if needed, race-free outside ofthe spin lock */

if (need_wakeup)

wake_up(&pkmap_map_wait);

}

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Repo: How To Revive Teammates

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hello Kitty Island Adventure: How To Get Giant Seeds

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

How Long Does It Take To Beat Split Fiction?

3 weeks ago By DDD

R.E.P.O. Save File Location: Where Is It & How to Protect It?

3 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7338

Java Tutorial

1627

CakePHP Tutorial

1352

Laravel Tutorial

1265

PHP Tutorial

1210

Related knowledge

Large memory optimization, what should I do if the computer upgrades to 16g/32g memory speed and there is no change? Jun 18, 2024 pm 06:51 PM

For mechanical hard drives or SATA solid-state drives, you will feel the increase in software running speed. If it is an NVME hard drive, you may not feel it. 1. Import the registry into the desktop and create a new text document, copy and paste the following content, save it as 1.reg, then right-click to merge and restart the computer. WindowsRegistryEditorVersion5.00[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SessionManager\MemoryManagement]"DisablePagingExecutive"=d

How to check memory usage on Xiaomi Mi 14Pro? Mar 18, 2024 pm 02:19 PM

Recently, Xiaomi released a powerful high-end smartphone Xiaomi 14Pro, which not only has a stylish design, but also has internal and external black technology. The phone has top performance and excellent multitasking capabilities, allowing users to enjoy a fast and smooth mobile phone experience. However, performance will also be affected by memory. Many users want to know how to check the memory usage of Xiaomi 14Pro, so let’s take a look. How to check memory usage on Xiaomi Mi 14Pro? Introduction to how to check the memory usage of Xiaomi 14Pro. Open the [Application Management] button in [Settings] of Xiaomi 14Pro phone. To view the list of all installed apps, browse the list and find the app you want to view, click on it to enter the app details page. In the application details page

Is there a big difference between 8g and 16g memory in computers? (Choose 8g or 16g of computer memory) Mar 13, 2024 pm 06:10 PM

When novice users buy a computer, they will be curious about the difference between 8g and 16g computer memory? Should I choose 8g or 16g? In response to this problem, today the editor will explain it to you in detail. Is there a big difference between 8g and 16g of computer memory? 1. For ordinary families or ordinary work, 8G running memory can meet the requirements, so there is not much difference between 8g and 16g during use. 2. When used by game enthusiasts, currently large-scale games basically start at 6g, and 8g is the minimum standard. Currently, when the screen is 2k, higher resolution will not bring higher frame rate performance, so there is no big difference between 8g and 16g. 3. For audio and video editing users, there will be obvious differences between 8g and 16g.

Samsung announced the completion of 16-layer hybrid bonding stacking process technology verification, which is expected to be widely used in HBM4 memory Apr 07, 2024 pm 09:19 PM

According to the report, Samsung Electronics executive Dae Woo Kim said that at the 2024 Korean Microelectronics and Packaging Society Annual Meeting, Samsung Electronics will complete the verification of the 16-layer hybrid bonding HBM memory technology. It is reported that this technology has passed technical verification. The report also stated that this technical verification will lay the foundation for the development of the memory market in the next few years. DaeWooKim said that Samsung Electronics has successfully manufactured a 16-layer stacked HBM3 memory based on hybrid bonding technology. The memory sample works normally. In the future, the 16-layer stacked hybrid bonding technology will be used for mass production of HBM4 memory. ▲Image source TheElec, same as below. Compared with the existing bonding process, hybrid bonding does not need to add bumps between DRAM memory layers, but directly connects the upper and lower layers copper to copper.

Micron: HBM memory consumes 3 times the wafer volume, and production capacity is basically booked for next year Mar 22, 2024 pm 08:16 PM

This site reported on March 21 that Micron held a conference call after releasing its quarterly financial report. At the conference, Micron CEO Sanjay Mehrotra said that compared to traditional memory, HBM consumes significantly more wafers. Micron said that when producing the same capacity at the same node, the current most advanced HBM3E memory consumes three times more wafers than standard DDR5, and it is expected that as performance improves and packaging complexity intensifies, in the future HBM4 This ratio will further increase. Referring to previous reports on this site, this high ratio is partly due to HBM’s low yield rate. HBM memory is stacked with multi-layer DRAM memory TSV connections. A problem with one layer means that the entire

Sources say Samsung Electronics and SK Hynix will commercialize stacked mobile memory after 2026 Sep 03, 2024 pm 02:15 PM

According to news from this website on September 3, Korean media etnews reported yesterday (local time) that Samsung Electronics and SK Hynix’s “HBM-like” stacked structure mobile memory products will be commercialized after 2026. Sources said that the two Korean memory giants regard stacked mobile memory as an important source of future revenue and plan to expand "HBM-like memory" to smartphones, tablets and laptops to provide power for end-side AI. According to previous reports on this site, Samsung Electronics’ product is called LPWide I/O memory, and SK Hynix calls this technology VFO. The two companies have used roughly the same technical route, which is to combine fan-out packaging and vertical channels. Samsung Electronics’ LPWide I/O memory has a bit width of 512

Lexar launches Ares Wings of War DDR5 7600 16GB x2 memory kit: Hynix A-die particles, 1,299 yuan May 07, 2024 am 08:13 AM

According to news from this website on May 6, Lexar launched the Ares Wings of War series DDR57600CL36 overclocking memory. The 16GBx2 set will be available for pre-sale at 0:00 on May 7 with a deposit of 50 yuan, and the price is 1,299 yuan. Lexar Wings of War memory uses Hynix A-die memory chips, supports Intel XMP3.0, and provides the following two overclocking presets: 7600MT/s: CL36-46-46-961.4V8000MT/s: CL38-48-49 -1001.45V In terms of heat dissipation, this memory set is equipped with a 1.8mm thick all-aluminum heat dissipation vest and is equipped with PMIC's exclusive thermal conductive silicone grease pad. The memory uses 8 high-brightness LED beads and supports 13 RGB lighting modes.

Kingbang launches new DDR5 8600 memory, offering CAMM2, LPCAMM2 and regular models to choose from Jun 08, 2024 pm 01:35 PM

According to news from this site on June 7, GEIL launched its latest DDR5 solution at the 2024 Taipei International Computer Show, and provided SO-DIMM, CUDIMM, CSODIMM, CAMM2 and LPCAMM2 versions to choose from. ▲Picture source: Wccftech As shown in the picture, the CAMM2/LPCAMM2 memory exhibited by Jinbang adopts a very compact design, can provide a maximum capacity of 128GB, and a speed of up to 8533MT/s. Some of these products can even be stable on the AMDAM5 platform Overclocked to 9000MT/s without any auxiliary cooling. According to reports, Jinbang’s 2024 Polaris RGBDDR5 series memory can provide up to 8400

See all articles