Detailed introduction to the Java memory model

The Java memory model specifies how the Java virtual machine works with the computer's memory (RAM). The Java virtual machine is a model of a whole computer, so this model naturally includes a memory model, known as the Java memory model.

Understanding the Java memory model is important if you want to design concurrent programs that behave correctly. The Java memory model specifies how and when different threads can see values written to shared variables by other threads, and how to synchronize access to shared variables when necessary.

The original Java memory model was insufficient, so it was revised in Java 1.5. That version of the Java memory model is still in use in Java 8.

Internal Java memory model

The Java memory model used internally in the JVM divides memory between thread stacks and the heap. This diagram illustrates the memory model from a logical perspective:

Each thread running in the Java virtual machine has its own thread stack. The thread stack contains information about the methods that this thread has called up to the current point of execution. We will also call this the "call stack". As the thread executes its code, this call stack changes.

The thread stack also contains all local variables for each method being executed (all methods on the call stack). A thread can only access its own thread stack. Local variables created by a thread are invisible to all other threads. Even if two threads are executing the exact same code, each thread still creates its own local variables in its own thread stack. Therefore, each thread has its own version of each local variable.

All local variables of primitive types (boolean, byte, short, char, int, long, float, double) are stored entirely in the thread stack and are therefore not visible to other threads. One thread may pass a copy of a primitive variable to another thread, but it cannot share the primitive local variable itself.

The heap contains all objects created in your Java application, regardless of which thread created the object. This includes the object versions of the primitive types (for example, Byte, Integer, Long, etc.). Whether an object is created and assigned to a local variable or created as a member variable of another object, the object is still stored in the heap.
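As a minimal sketch of this point, the example below runs the same method on two threads at once. The class and method names (LocalVariableDemo, sumUpTo, runOnTwoThreads) are hypothetical and chosen only for illustration. Each call gets its own sum and i on the calling thread's stack, so the two computations cannot interfere with each other.

```java
class LocalVariableDemo {

    // 'sum' and 'i' are primitive local variables: every call gets its own
    // copies on the stack of the thread that makes the call.
    static int sumUpTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs sumUpTo on two threads at the same time and returns both results.
    static int[] runOnTwoThreads(int n1, int n2) {
        int[] results = new int[2]; // heap array; each thread writes only its own slot
        Thread t1 = new Thread(() -> results[0] = sumUpTo(n1));
        Thread t2 = new Thread(() -> results[1] = sumUpTo(n2));
        t1.start();
        t2.start();
        try {
            t1.join(); // after join(), the thread's writes are visible to the caller
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] r = runOnTwoThreads(100, 1000);
        System.out.println(r[0] + " " + r[1]); // prints "5050 500500"
    }
}
```

The results are handed back through a heap array only because the local variables themselves cannot be shared. Each thread passes a copy of its result out of its own stack.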

Here is a diagram showing the call stack and local variables stored in the thread stack, and objects stored in the heap:

A local variable may be of a primitive type, in which case it is stored entirely in the thread stack.

A local variable may be an object reference. In this scenario the reference (local variable) is stored in the thread stack, but the object itself is stored in the heap.

An object may contain methods, and these methods contain local variables. These local variables are also stored in the thread stack, even if the object this method belongs to is stored in the heap.

The member variables of an object are stored in the heap along with the object itself. This is true both when the member variable is of a primitive type and when it is a reference to an object.

Static class variables are also stored in the heap.

Objects in the heap can be accessed by all threads that have a reference to the object. When a thread has access to an object, it can also access that object's member variables. If two threads call a method on the same object at the same time, they both have access to the object's member variables, but each thread has its own copy of the method's local variables.

Here is an illustration based on the above explanation:

Each of the two threads has a set of local variables. One of the local variables (Local Variable 2) points to a shared object in the heap (Object 3). The two threads each have a different reference to the same object. The references are local variables and are therefore stored in each thread's own stack, but the object that the two different references point to is in the heap.

Note how the shared object (Object 3) has references to Object 2 and Object 4 as member variables (shown by the arrows in the illustration). Through these member variable references in Object 3, both threads can also access Object 2 and Object 4.

The diagram also shows a local variable that points to two different objects in the heap. In this case, the references point to two different objects (Object 1 and Object 5), not the same object. In theory, both threads could access both Object 1 and Object 5 if both threads had references to both objects. But in the diagram, each thread only has a reference to one of the two objects.

So, what kind of Java code could lead to the memory layout shown in the diagram above? Code as simple as the following:

public class MyRunnable implements Runnable {

    public void run() {
        methodOne();
    }

    public void methodOne() {
        int localVariable1 = 45;

        MySharedObject localVariable2 =
            MySharedObject.sharedInstance;

        //... do more with local variables.

        methodTwo();
    }

    public void methodTwo() {
        Integer localVariable1 = new Integer(99);

        //... do more with local variable.
    }
}

public class MySharedObject {

    //static variable pointing to instance of MySharedObject

    public static final MySharedObject sharedInstance =
        new MySharedObject();


    //member variables pointing to two objects on the heap

    public Integer object2 = new Integer(22);
    public Integer object4 = new Integer(44);

    public long member1 = 12345;
    public long member2 = 67890;
}

If two threads execute the run method, the result is the diagram shown earlier. The run method calls methodOne, and methodOne calls methodTwo.

The methodOne method declares a local variable of a primitive type (localVariable1, of type int) and a local variable that is an object reference (localVariable2).

When each thread executes methodOne, it creates its own copies of localVariable1 and localVariable2 in its own thread stack. The localVariable1 variables are completely separate from each other and live only in their respective thread stacks. One thread cannot see the changes another thread makes to its copy of localVariable1.

Each thread executing methodOne also creates its own copy of localVariable2. However, the two copies of localVariable2 both end up pointing to the same object in the heap. The code sets localVariable2 to the object referenced by a static variable. There is only one copy of a static variable, and this copy is stored in the heap. Therefore, both copies of localVariable2 end up pointing to the same MySharedObject instance, the one the static variable points to. This MySharedObject instance is also stored in the heap. It corresponds to Object 3 in the diagram above.

Note that the MySharedObject class also contains two member variables that reference objects (object2 and object4). The member variables themselves are stored in the heap along with the object. They point to two other Integer objects, which correspond to Object 2 and Object 4 in the diagram above.

Also note how methodTwo creates a local variable named localVariable1. This local variable is a reference to an Integer object. The method sets the localVariable1 reference to point to a new Integer instance. One copy of the localVariable1 reference is stored in the stack of each thread executing methodTwo. The Integer objects themselves are stored in the heap, but because the method creates a new Integer object every time it is executed, two threads executing this method create separate Integer instances. The Integer objects created inside methodTwo correspond to Object 1 and Object 5 in the diagram above.

Also note that the two member variables of type long in the MySharedObject class are of a primitive type. Because these variables are member variables, they are still stored in the heap along with the object. Only local variables are stored in the thread stack.
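The layout described above can be checked with a small experiment. The sketch below is not the article's code. It uses hypothetical stand-in names (SharedInstanceDemo, Shared, Observation) and records, for each of two threads, the reference it read from the static field and the object it created itself.

```java
class SharedInstanceDemo {

    // Stand-in for MySharedObject: a single instance reachable through a static field.
    static class Shared {
        static final Shared INSTANCE = new Shared();
    }

    // What one thread saw: the shared reference it read and the object it created.
    static class Observation {
        Shared shared;
        Object own;
    }

    static Observation[] observeFromTwoThreads() {
        Observation[] seen = { new Observation(), new Observation() };
        Thread[] threads = new Thread[2];
        for (int i = 0; i < threads.length; i++) {
            Observation slot = seen[i];
            threads[i] = new Thread(() -> {
                Shared localRef = Shared.INSTANCE; // local reference on this thread's stack
                Object localObj = new Object();    // a new heap object for every execution
                slot.shared = localRef;
                slot.own = localObj;
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // makes the threads' writes visible to the caller
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        Observation[] seen = observeFromTwoThreads();
        System.out.println("same shared object: " + (seen[0].shared == seen[1].shared)); // true
        System.out.println("same own object:    " + (seen[0].own == seen[1].own));       // false
    }
}
```

Both threads end up with references to the same Shared instance (the counterpart of Object 3), while the objects they create themselves are distinct (the counterparts of Object 1 and Object 5).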

Hardware memory architecture

Modern hardware memory architecture differs somewhat from the internal Java memory model. It is important to understand the hardware memory architecture too, because it helps explain how the Java memory model works with it. This section describes the common hardware memory architecture, and a later section describes how the Java memory model maps onto it.

Here is a simplified diagram of the hardware structure of a modern computer:

Modern computers often have two or more CPUs. Some of these CPUs may also have multiple cores. The point is that on a computer with two or more CPUs, more than one thread can be running at the same time. Each CPU can run one thread at any given time. That means that if your Java application is multithreaded, one thread per CPU may be running simultaneously inside your application.

Each CPU contains a set of registers, which are essentially in-CPU memory. The CPU can perform operations on these registers much faster than on variables in main memory, because it can access registers much faster than it can access main memory.

Each CPU may also have a CPU cache memory layer. In fact, most modern CPUs have a cache memory layer of some size. The CPU can access its cache memory much faster than main memory, but not as fast as its internal registers. The access speed of CPU cache memory therefore lies between that of the internal registers and that of main memory. Some CPUs have multiple cache levels (level 1 and level 2), but this is not important for understanding how the Java memory model interacts with memory. What matters is that a CPU may have a cache memory layer.

A computer also contains a main memory area (RAM). All CPUs can access main memory. The main memory area is typically larger than the cache memories of the CPUs.

Typically, when a CPU needs to access main memory, it reads part of main memory into its CPU cache. It may even read part of the cache into its internal registers and then perform operations there. When the CPU needs to write the result back to main memory, it flushes the value from its internal register to the cache memory, and at some point flushes the value back to main memory.

The values stored in cache memory are flushed back to main memory when the CPU needs to store something else in the cache. The CPU cache can have data written to part of its memory at a time, and can flush part of its memory at a time. It does not have to read or write the entire cache every time. Typically, the cache is updated in smaller memory blocks called "cache lines". One or more cache lines may be read into cache memory, and one or more cache lines may be flushed back to main memory.

Bridging the gap between the Java memory model and the hardware memory architecture

As already mentioned, the Java memory model and the hardware memory architecture are different. The hardware memory architecture does not distinguish between thread stacks and the heap. On the hardware, both the thread stack and the heap are located in main memory. Parts of the thread stacks and the heap may sometimes be present in CPU caches and in internal CPU registers, as shown in the following figure:

When objects and variables can be stored in various different memory areas in the computer, certain problems may occur. The two main problems are:

Visibility of thread updates (writes) to shared variables
Race conditions when reading, checking, and writing shared variables

These issues will be explained in the following sections.

Visibility of shared objects

If two or more threads share an object without proper use of volatile declarations or synchronization, updates to the shared object made by one thread may not be visible to other threads.

Imagine that the shared object is initially stored in main memory. A thread running on one CPU then reads the shared object into its CPU cache, where it makes a change to the shared object. As long as the CPU cache has not been flushed back to main memory, the changed version of the shared object is not visible to threads running on other CPUs. This way, each thread may end up with its own copy of the shared object, each copy sitting in a different CPU cache.

The diagram below illustrates the situation. A thread running on the left CPU copies the shared object into its CPU cache and changes its count variable to 2. This change is not visible to threads running on the right CPU, because the update to count has not yet been flushed back to main memory.

To solve this problem, you can use Java's volatile keyword. This keyword ensures that a given variable is read directly from main memory and written directly to main memory when updated.
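A minimal sketch of the volatile keyword in use follows. The names (VolatileFlagDemo, running, stopWorker) are hypothetical, and the 50 ms pause and 5 second timeout are arbitrary values chosen for the illustration. A worker thread spins on a shared flag until the main thread clears it.

```java
class VolatileFlagDemo {

    // volatile guarantees that the main thread's write to this flag
    // becomes visible to the worker thread's reads.
    private static volatile boolean running = true;

    // Starts a worker that spins on the flag, clears the flag, and reports
    // whether the worker noticed the change and terminated.
    static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait until another thread clears the flag
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(50);   // let the worker spin for a moment
            running = false;    // volatile write by the main thread
            worker.join(5000);  // wait up to five seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopWorker());
    }
}
```

If the volatile modifier were removed from the flag, the worker would no longer be guaranteed to see the update and might keep spinning. Whether that actually happens depends on the JVM and the hardware.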

Race condition

If two or more threads share an object, and more than one thread updates variables in that shared object, race conditions may occur.

Imagine that thread A reads the count variable of a shared object into its CPU cache. Meanwhile, thread B does the same, but into a different CPU cache. Now thread A adds one to count, and thread B does the same. The variable has now been incremented twice, once in each CPU cache.

If these increments had been carried out sequentially, the count variable would have been incremented twice, and the original value + 2 would have been written back to main memory.

However, the two increments were carried out concurrently without proper synchronization. Regardless of which of thread A and thread B writes its updated version of count back to main memory last, the updated value is only 1 higher than the original value, despite the two increments.
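The arithmetic of this lost update can be replayed step by step. The sketch below is a single-threaded simulation, not real concurrent code. The class, method, and variable names (LostUpdateDemo, replayLostUpdate, cacheA, cacheB) are hypothetical, with plain local variables standing in for main memory and the two CPU caches.

```java
class LostUpdateDemo {

    // Replays the interleaving on one thread so the outcome is deterministic.
    static int replayLostUpdate(int initial) {
        int mainMemoryCount = initial;

        int cacheA = mainMemoryCount; // thread A reads count into its cache
        int cacheB = mainMemoryCount; // thread B reads the same value into its cache

        cacheA = cacheA + 1;          // thread A increments its copy
        cacheB = cacheB + 1;          // thread B increments its copy

        mainMemoryCount = cacheA;     // thread A writes its copy back
        mainMemoryCount = cacheB;     // thread B overwrites it with the same value

        return mainMemoryCount;
    }

    public static void main(String[] args) {
        System.out.println(replayLostUpdate(1)); // prints 2, although two increments ran
    }
}
```

Starting from 1, two sequential increments would give 3. The replayed interleaving gives 2, because both threads computed their result from the same original value.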

This diagram shows the problem with the race condition described above:

To solve this problem, you can use a Java synchronized block. A synchronized block guarantees that only one thread can enter a given critical section of the code at any time. Synchronized blocks also guarantee that all variables accessed inside the synchronized block are read from main memory, and that when the thread exits the synchronized block, all updated variables are flushed back to main memory, regardless of whether the variable is declared volatile.
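As a minimal sketch of this approach, the counter below guards its count variable with synchronized methods. The names (SynchronizedCounterDemo, Counter, countWithTwoThreads) are hypothetical and used only for this illustration.

```java
class SynchronizedCounterDemo {

    static class Counter {
        private long count = 0;

        // Only one thread at a time can execute a synchronized method
        // on the same Counter instance.
        synchronized void increment() {
            count++;
        }

        synchronized long get() {
            return count;
        }
    }

    // Two threads each increment the same counter the given number of times.
    static long countWithTwoThreads(int incrementsPerThread) {
        Counter counter = new Counter();
        Runnable task = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000)); // prints 200000
    }
}
```

Because every read-increment-write sequence runs while holding the counter's lock, no increment is lost and the total is always the sum of both threads' increments.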
