This is a fairly complicated issue. Let me share my understanding.
First of all, you need to know that processes use virtual address spaces. Each process has its own independent, complete address space (4GB on 32-bit systems). Not every part of that address space is necessarily mapped to physical memory; the mapping is done by the operating system. If you access an unmapped address, or write to a read-only area, the operating system reports an error (the famous segmentation fault, haha~) and terminates your program.
When a process starts running, it requests a block of "heap" memory from the operating system, and the program manages that heap itself; malloc allocates from it. In C, requesting and managing heap memory is handled automatically by the runtime library.
When a block is freed, free (that is, the runtime library) marks it as unused, and it may be handed out again by a later allocation. But the process can still read and write this memory, because the runtime library obtained it from the operating system and manages it on its own, so it remains mapped.
Local variables are allocated on the stack. When the process starts running, the operating system gives it a stack of limited size (often 1MB, although the default varies by platform). The so-called allocation and release of local variables is just a matter of moving the stack pointer. As long as an access stays inside the stack region, the memory can still be read and written.
The above describes the typical behavior of common operating systems; some operating systems and platforms may work differently. In short, using freed memory is very dangerous.
If you are interested, you can read these two books: "In-depth Understanding of Computer Systems" (the Chinese edition of "Computer Systems: A Programmer's Perspective") and "Link Loading and Libraries".
When I was in school more than ten years ago, computers were still scarce. Before we bought personal computers, we usually went to the school's computer room. The computer room was managed by dedicated staff and had many rules, including "Do not modify system configuration files at will" and "Do not do things unrelated to study". And we would often happily change autoexec.bat/config.sys first (pretty pointless, right? All for a few dozen KB :)), then copy Jin Yongqun to the machine and secretly play Legend of Legends and the like for half an hour. When I got on the computer the next day, I would go straight to the directory where I had saved my files the day before. When I was lucky, the files were still there, so I was overjoyed and carried on from where I had left off; when I was unlucky, not only had the files been deleted, but I also found that the new virus on the machine was very powerful, ->_->.
Maybe you already see what I am getting at with this trivial old story.
Actually, what I want to say is two points:
1. You can break the rules, but that doesn't mean there are no rules.
2. The consequences of breaking the rules are unpredictable (undefined).
The original poster did not clearly explain why he thinks that freed memory is still available, or why he thinks that the data of local variables is still available.
To illustrate the problem, consider the following program:
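#include <stdio.h>
#include <stdlib.h>

/* Returns the address of a local variable; dereferencing it later is undefined behavior. */
int *lvret(void) {
    int ret = 5;
    return &ret;
}

int main(void) {
    int *p = lvret();
    printf("%d\n", *p); /* reads a stack slot that is no longer alive */
    return 0;
}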
Compiled without optimization (for example with gcc 4.8), this program is likely to print 5.
So after the function returns, the data of the local variable is still available, right?
Now consider the following program:
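#include <stdio.h>
#include <stdlib.h>

/* Returns the address of a local variable; dereferencing it later is undefined behavior. */
int *lvret(void) {
    int ret = 5;
    return &ret;
}

/* The local a is likely to land in the stack slot that ret used to occupy. */
void mod(void) {
    int a = 7;
}

int main(void) {
    int *p = lvret();
    mod();
    printf("%d\n", *p); /* the old slot may now hold the value written by mod() */
    return 0;
}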
The result of running this program is likely to be 7.
Clearly, the contents at this address can no longer be relied on.
(The compiler does warn about this; gcc 4.8, for example, reports:)
warning: function returns address of local variable [-Wreturn-local-addr]
    return &ret;
So this example tells us that memory you are able to access is not necessarily safe to use.
In other words, if you find that freed memory or the memory once occupied by local variables can still be read and written, that is just an accident: it happens not to have been overwritten by other code yet.
C is not a memory-safe language.
Neither is C++, but C++11 is much better (it adopts smart pointers).
Supplement
The above mainly explains why what the original poster describes should fundamentally be avoided.
Let me add some answers to the original poster's specific questions:
Question 1:
This concerns heap management.
The subtext of the original poster's question is probably "Since the memory has been released, why is there no segmentation fault when I access it?"
The answer is that this is decided at the level of the C runtime implementation. Most runtime libraries do not try to return every "freed" block to the system right away (returning it to the system means unmapping it from the process address space). Therefore, accessing these addresses does not produce the segmentation fault you might expect.
But this is not always the case. There are exceptions, such as OpenBSD. Wikipedia has the following description:
On a call to free, memory is released and unmapped from the process address space using munmap. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap system call , and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.
This also confirms indirectly that the behavior the original poster observed cannot be relied on.
(As for why most runtime libraries adopt such a memory management strategy, that is another topic)
Question 2:
This concerns stack management.
@ Elite Prince has explained it.
Memory management has the following levels (from high to low): C program - C library (malloc) - operating system - physical memory
First of all, the operating system ensures that each process has an independent virtual address space (4GB on 32-bit systems, though ordinary processes use far less). Physical memory, of course, is shared by all processes, so when you need dynamic memory you have to request it from the operating system. From your program's point of view the memory is contiguous, but in fact the operating system has merely mapped it onto some piece of physical memory. After the program has finished with the memory and returned it, the part that is actually returned may be given to other processes by the operating system.
Note that the "return" mentioned above is the behavior of the malloc library. The malloc library uses various strategies to make memory usage more efficient. For example, when a program needs 10K of memory, malloc may actually request 1M, because system calls are expensive. Likewise, even if you call free to "return" memory used by the program, the malloc library may not actually return it to the operating system, because the program may request dynamic memory again later.
The malloc library has multiple implementations. The one I know of uses tags to store meta-information about each block. For example, if you request 8 bytes and get back the head pointer 0x1001 (the actual memory is 0x1001-0x1008), malloc saves 8, the length of this block, at 0x1000 (the position just before the head pointer). When releasing, the program passes the head pointer to free, and the malloc library finds the length of the block to release at the position just before the head pointer and releases it (the actual operation may simply clear the tag). This explains: 1. why, unlike malloc, free takes only a head pointer and does not need a length; 2. why, after free, the memory may not actually be returned to the operating system.
So accessing memory that the program has released is undefined behavior, which means the result is uncertain. As long as the malloc library has neither returned this memory to the operating system nor handed it out in a later allocation, it still belongs to the program. And since the malloc library usually does not wipe returned memory (this is the case in most implementations), the value you read is still the original value. It is the same as with function calls: when a call finishes, the stack frame is not cleared, so a later function call may still see the local variable values that were set before.
However, once the malloc library has returned the memory to the system, accessing the original address (let alone writing to it) produces the classic segmentation fault, because the address no longer belongs to the program.
In the final analysis, these phenomena result from compromises the C library makes for the sake of efficiency.
The poster above explained it in detail. After releasing memory, it is generally recommended to set the pointer to NULL so that the released memory is not used by accident, for example: free(p); p = NULL;
free() and malloc() use some data structures (mainly linked lists) to manage the memory allocated from the heap, but they are only library functions. The call that actually moves the heap boundary is brk()/sbrk(). When malloc() finds that the current heap is not large enough to satisfy a request, it first calls sbrk() to extend the heap boundary.
free() merely marks the block as unused in those data structures; the block is still inside the heap, so the process can still access it. When a large amount of memory at the top of the heap is unused, many implementations shrink the boundary, and the freed memory beyond it becomes inaccessible.
The memory management mechanism of Glibc under Linux is roughly as follows:
From the operating system's point of view, a process obtains memory through two system calls: brk and mmap. brk pushes _edata, the pointer to the highest address of the data segment (.data), toward higher addresses, while mmap finds a free region in the process's virtual address space. Memory allocated with mmap is released with munmap and goes back to the operating system immediately. Memory allocated with brk can only be returned once the memory above it has been released. That is to say, if you allocate two blocks A and then B through brk, A cannot be returned to the system before B is released; until then it is still held by the process. This is why top may show what looks like a "memory leak". By default, allocations of 128KB or more use mmap/munmap, and requests smaller than 128KB use sbrk (the threshold can be adjusted through M_MMAP_THRESHOLD).
Go check this out, there is a quick navigation subtitle, just read section 9.9. After reading this, you will know how malloc manages virtual memory, god bless you.
Reprinted from: http://www.nosqlnotes.net/archives/105
And: http://bbs.csdn.net/topics/330179712
http://blog.csdn.net/cinmyheart/article/details/38136375