Analysis of the Buffer source code in Java

This article analyzes the source code of Buffer in Java; readers who need it can use it as a reference.

Buffer

[Figure: class diagram of Buffer]

Every primitive type except boolean has a corresponding Buffer class, but only ByteBuffer can interact with a Channel. Only ByteBuffer can allocate Direct buffers; Buffers of the other types can only be allocated as Heap buffers. ByteBuffer can create view Buffers of the other types, and if the ByteBuffer itself is Direct, every view Buffer it creates is also Direct.
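The rule about view Buffers can be checked with a small sketch. The class name ViewBufferDemo and its helper method are made up for this illustration; only the java.nio calls come from the standard library.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

class ViewBufferDemo {
    // Creates an IntBuffer view over a ByteBuffer and reports whether the view is direct.
    static boolean viewIsDirect(boolean directParent) {
        ByteBuffer parent = directParent
                ? ByteBuffer.allocateDirect(16)  // direct memory
                : ByteBuffer.allocate(16);       // backed by a byte[] on the heap
        IntBuffer view = parent.asIntBuffer();
        return view.isDirect();
    }

    public static void main(String[] args) {
        System.out.println(viewIsDirect(true));                // view of a Direct ByteBuffer
        System.out.println(viewIsDirect(false));               // view of a Heap ByteBuffer
        System.out.println(IntBuffer.allocate(4).isDirect());  // IntBuffer allocated on its own
    }
}
```

Only the first line printed should be true: the view inherits directness from its parent ByteBuffer, and IntBuffer.allocate always produces a Heap buffer.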

The essence of Direct and Heap type Buffer

First, let's look at how the JVM performs IO operations.

The JVM performs IO through operating system calls. For example, it reads a file through the read system call, whose prototype is: ssize_t read(int fd, void *buf, size_t nbytes). Like other IO system calls, it takes a buffer as one of its parameters, and that buffer must be contiguous.

Buffer is divided into two categories: Direct and Heap. These two types of buffers are explained below.

Heap

A Heap Buffer lives on the JVM heap, and its memory is reclaimed and compacted in the same way as that of ordinary objects. Every Heap Buffer object holds an array field of the corresponding primitive type (for example: final **[] hb), and that array is the underlying storage of the Heap Buffer.

However, a Heap Buffer cannot be passed directly as the buffer parameter of a system call, mainly for the following two reasons.

The JVM may move the buffer during GC (copying and compacting), so the address of the buffer is not fixed.
A system call requires a contiguous buffer, but an array is not guaranteed to be contiguous (JVM implementations are not required to store arrays contiguously).

So when a Heap Buffer is used for IO, the JVM has to create a temporary Direct Buffer, copy the data into it, and then pass the temporary Direct Buffer as the parameter of the system call. This is inefficient, mainly for two reasons:

The data needs to be copied from the Heap type Buffer to the temporarily created Direct Buffer.
A large number of Buffer objects may be generated, thereby increasing the frequency of GC. Therefore, during IO operations, optimization can be performed by reusing the Buffer.
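The advice to reuse a Buffer can be sketched as follows. This is a minimal illustration, not a benchmark: the class ReusedBufferCopy, its methods, the 8 KB buffer size and the temporary file names are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReusedBufferCopy {
    // Copies src to dst through a single Direct Buffer that is reused for every read/write.
    static long copy(Path src, Path dst) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(8 * 1024); // allocated once
        long total = 0;
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buf) != -1) {
                buf.flip();                   // switch from filling to draining
                while (buf.hasRemaining()) {
                    total += out.write(buf);
                }
                buf.clear();                  // reuse the same memory for the next read
            }
        }
        return total;
    }

    // Creates a temporary file of the given size, copies it, and returns the bytes copied.
    static long demo(int fileSize) {
        try {
            Path src = Files.createTempFile("buffer-src", ".bin");
            Path dst = Files.createTempFile("buffer-dst", ".bin");
            try {
                Files.write(src, new byte[fileSize]);
                return copy(src, dst);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(20_000));
    }
}
```

Because the buffer is Direct and allocated once, no temporary copy buffer is needed per call and no new Buffer objects are created inside the loop.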

Direct

A Direct Buffer does not live on the heap; it is a contiguous block of memory that the JVM allocates directly through malloc. This memory is called direct memory, and the JVM uses it as the buffer when making IO system calls.

The -XX:MaxDirectMemorySize option sets the maximum amount of direct memory that may be allocated (memory allocated for a MappedByteBuffer is not subject to this limit).
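To see how much direct memory is currently accounted for, the standard BufferPoolMXBean can be queried. The sketch below assumes the JVM exposes a pool named "direct", as HotSpot does; the class DirectMemoryUsage and its methods are hypothetical names for this example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectMemoryUsage {
    // Returns the bytes used by the named buffer pool, or -1 if no such pool is exposed.
    static long poolUsed(String name) {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(name)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    // Allocates a Direct Buffer and reports the pool usage while the buffer is still alive.
    static long usedWhileHolding(int bytes) {
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes);
        long used = poolUsed("direct"); // pool name is an assumption about the JVM
        buf.put(0, (byte) 1);           // keeps buf reachable until after the measurement
        return used;
    }

    public static void main(String[] args) {
        System.out.println("direct pool bytes: " + usedWhileHolding(1 << 20));
    }
}
```

Running the program with, for example, java -XX:MaxDirectMemorySize=64m DirectMemoryUsage.java caps the total that allocateDirect may reserve; requests beyond the cap fail with OutOfMemoryError.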

Direct memory is reclaimed differently from heap memory, and careless use can easily cause an OutOfMemoryError. Java provides no explicit method for releasing direct memory. The sun.misc.Unsafe class can operate on the underlying memory directly, and direct memory can be released and managed through it. As with Heap Buffers, Direct Buffers should be reused to improve efficiency.

The relationship between MappedByteBuffer and DirectByteBuffer

This is a little bit backwards: By rights MappedByteBuffer should be a subclass of DirectByteBuffer, but to keep the spec clear and simple, and for optimization purposes, it's easier to do it the other way around. This works because DirectByteBuffer is a package-private class. (This comment is taken from the source code of MappedByteBuffer.)

In fact, a MappedByteBuffer is a mapped buffer (see virtual memory for background), whereas a DirectByteBuffer only indicates that the memory is a contiguous buffer allocated by the JVM in the direct memory area and is not necessarily mapped. In other words, MappedByteBuffer should logically be a subclass of DirectByteBuffer, but for convenience and optimization it is made the parent class of DirectByteBuffer instead. In addition, although the memory of a MappedByteBuffer is reclaimed in a way similar to direct memory (and unlike heap GC), the size of a MappedByteBuffer is not limited by the -XX:MaxDirectMemorySize parameter.

MappedByteBuffer encapsulates memory-mapped file operations, which means it can only be used for file IO. It is a mapped buffer created through mmap: the buffer is mapped onto the corresponding file pages and is direct memory in user space. The mapped buffer can be manipulated directly through MappedByteBuffer, and the operating system completes the reading and writing of the file by paging the corresponding memory pages in and out.
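As a minimal sketch of what this looks like from user code, the example below writes a string into a file through a mapping and reads it back with ordinary file IO. The class MappedWriteDemo, the method roundTrip and the temporary file name are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedWriteDemo {
    // Writes text into a temporary file through a mapping, then reads the file back.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("mapped", ".txt");
            file.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            byte[] data = text.getBytes(StandardCharsets.UTF_8);
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // The file is empty, so map() extends it to data.length bytes.
                MappedByteBuffer map =
                        ch.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
                map.put(data);  // writes go straight to the mapped pages
                map.force();    // ask the OS to flush the pages to the file
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello mmap"));
    }
}
```

Note that closing the channel does not unmap the buffer; the mapping stays valid until the buffer itself is garbage-collected.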

MappedByteBuffer

A MappedByteBuffer is obtained through FileChannel.map(MapMode mode, long position, long size). How it is created is explained below with the source code.

FileChannel.map’s source code:

public MappedByteBuffer map(MapMode mode, long position, long size)
    throws IOException
  {
    ensureOpen();
    if (position < 0L)
      throw new IllegalArgumentException("Negative position");
    if (size < 0L)
      throw new IllegalArgumentException("Negative size");
    if (position + size < 0)
      throw new IllegalArgumentException("Position + size overflow");
    // At most 2 GB
    if (size > Integer.MAX_VALUE)
      throw new IllegalArgumentException("Size exceeds Integer.MAX_VALUE");
    int imode = -1;
    if (mode == MapMode.READ_ONLY)
      imode = MAP_RO;
    else if (mode == MapMode.READ_WRITE)
      imode = MAP_RW;
    else if (mode == MapMode.PRIVATE)
      imode = MAP_PV;
    assert (imode >= 0);
    if ((mode != MapMode.READ_ONLY) && !writable)
      throw new NonWritableChannelException();
    if (!readable)
      throw new NonReadableChannelException();

    long addr = -1;
    int ti = -1;
    try {
      begin();
      ti = threads.add();
      if (!isOpen())
        return null;
      // size() returns the actual file size
      // If the file is too small, extend it; the file size changes and the added region is zero-filled by default.
      if (size() < position + size) { // Extend file size
        if (!writable) {
          throw new IOException("Channel not open for writing " +
            "- cannot extend file to required size");
        }
        int rv;
        do {
          // Extend the file
          rv = nd.truncate(fd, position + size);
        } while ((rv == IOStatus.INTERRUPTED) && isOpen());
      }
      // If the requested mapping size is 0, skip the operating system's mmap call and
      // simply create and return a DirectByteBuffer with capacity 0
      if (size == 0) {
        addr = 0;
        // a valid file descriptor is not required
        FileDescriptor dummy = new FileDescriptor();
        if ((!writable) || (imode == MAP_RO))
          return Util.newMappedByteBufferR(0, 0, dummy, null);
        else
          return Util.newMappedByteBuffer(0, 0, dummy, null);
      }
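      // allocationGranularity is 4K on my system
      // Page alignment: pagePosition is the offset of position within its page
      int pagePosition = (int)(position % allocationGranularity);
      // Start the mapping at the beginning of the page
      long mapPosition = position - pagePosition;
      // Because the mapping starts at the page boundary, enlarge the mapped size accordingly
      long mapSize = size + pagePosition;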
      try {
        // If no exception was thrown from map0, the address is valid
        // Native method; its source is in openjdk/jdk/src/solaris/native/sun/nio/ch/FileChannelImpl.c,
        // see the explanation below
        addr = map0(imode, mapPosition, mapSize);
      } catch (OutOfMemoryError x) {
        // An OutOfMemoryError may indicate that we've exhausted memory
        // so force gc and re-attempt map
        System.gc();
        try {
          Thread.sleep(100);
        } catch (InterruptedException y) {
          Thread.currentThread().interrupt();
        }
        try {
          addr = map0(imode, mapPosition, mapSize);
        } catch (OutOfMemoryError y) {
          // After a second OOME, fail
          throw new IOException("Map failed", y);
        }
      }

      // On Windows, and potentially other platforms, we need an open
      // file descriptor for some mapping operations.
      FileDescriptor mfd;
      try {
        mfd = nd.duplicateForMapping(fd);
      } catch (IOException ioe) {
        unmap0(addr, mapSize);
        throw ioe;
      }

      assert (IOStatus.checkAll(addr));
      assert (addr % allocationGranularity == 0);
      int isize = (int)size;
      Unmapper um = new Unmapper(addr, mapSize, isize, mfd);
      if ((!writable) || (imode == MAP_RO)) {
        return Util.newMappedByteBufferR(isize,
                         addr + pagePosition,
                         mfd,
                         um);
      } else {
        return Util.newMappedByteBuffer(isize,
                        addr + pagePosition,
                        mfd,
                        um);
      }
    } finally {
      threads.remove(ti);
      end(IOStatus.checkAll(addr));
    }
  }
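The page-alignment arithmetic in the middle of map() can be checked in isolation. The sketch below copies the three expressions from the listing; the class MapAlignment, the method align and the sample values are assumptions, with 4096 standing in for allocationGranularity.

```java
import java.util.Arrays;

class MapAlignment {
    // Returns {pagePosition, mapPosition, mapSize} computed the same way as in map().
    static long[] align(long position, long size, long granularity) {
        int pagePosition = (int) (position % granularity); // offset inside the page
        long mapPosition = position - pagePosition;        // start of that page
        long mapSize = size + pagePosition;                // bytes from page start to end of region
        return new long[] {pagePosition, mapPosition, mapSize};
    }

    public static void main(String[] args) {
        // Request 100 bytes starting at file offset 5000 with 4096-byte pages.
        System.out.println(Arrays.toString(align(5000, 100, 4096))); // [904, 4096, 1004]
    }
}
```

The native call maps 1004 bytes starting at offset 4096, and the buffer returned to the caller starts at addr + pagePosition, which corresponds to the requested offset 5000.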

map0’s source code implementation:

JNIEXPORT jlong JNICALL
Java_sun_nio_ch_FileChannelImpl_map0(JNIEnv *env, jobject this,
                   jint prot, jlong off, jlong len)
{
  void *mapAddress = 0;
  jobject fdo = (*env)->GetObjectField(env, this, chan_fd);
  // Linux system calls refer to files by an integer file descriptor; obtain it here
  jint fd = fdval(env, fdo);
  int protections = 0;
  int flags = 0;

  if (prot == sun_nio_ch_FileChannelImpl_MAP_RO) {
    protections = PROT_READ;
    flags = MAP_SHARED;
  } else if (prot == sun_nio_ch_FileChannelImpl_MAP_RW) {
    protections = PROT_WRITE | PROT_READ;
    flags = MAP_SHARED;
  } else if (prot == sun_nio_ch_FileChannelImpl_MAP_PV) {
    protections = PROT_WRITE | PROT_READ;
    flags = MAP_PRIVATE;
  }
  // This is the actual operating system call; mmap64 is a macro that ultimately calls mmap
  mapAddress = mmap64(
    0,          /* Let OS decide location */
    len,         /* Number of bytes to map */
    protections,     /* File permissions */
    flags,        /* Changes are shared */
    fd,          /* File descriptor of mapped file */
    off);         /* Offset into file */

  if (mapAddress == MAP_FAILED) {
    if (errno == ENOMEM) {
      // If the mapping failed for lack of memory, throw OutOfMemoryError directly
      JNU_ThrowOutOfMemoryError(env, "Map failed");
      return IOS_THROWN;
    }
    return handle(env, -1, "Map failed");
  }

  return ((jlong) (unsigned long) mapAddress);
}

Although the size parameter of FileChannel.map() is a long, its maximum value is Integer.MAX_VALUE, so a single call can map at most 2 GB. The mmap call provided by the operating system can map larger regions, but Java limits it to 2 GB, and Buffers such as ByteBuffer likewise cannot exceed 2 GB.
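The size check can be observed without mapping anything, because it runs before the file is touched. In this sketch the class MapSizeLimit, its method and the temporary file are made-up names; the behavior relied on is the IllegalArgumentException thrown by the listing above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapSizeLimit {
    // Returns true if map() rejects a size one byte larger than Integer.MAX_VALUE.
    static boolean rejectsOversizedMapping() {
        try {
            Path file = Files.createTempFile("limit", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                ch.map(FileChannel.MapMode.READ_ONLY, 0, Integer.MAX_VALUE + 1L);
                return false; // not expected: the size check should fail first
            } catch (IllegalArgumentException e) {
                return true;  // "Size exceeds Integer.MAX_VALUE"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsOversizedMapping());
    }
}
```

A file larger than 2 GB therefore has to be mapped as several separate regions, each obtained with its own call to map() at a different position.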

MappedByteBuffer is a buffer created through mmap. This memory is created and managed directly by the operating system. When it is no longer needed, the JVM has the operating system release it through munmap.

Heap****Buffer

The following uses ByteBuffer as an example to illustrate the details of the Heap type Buffer.

This type of Buffer can be created in the following ways:

ByteBuffer.allocate(int capacity)
ByteBuffer.wrap(byte[] array) uses the passed array as the underlying buffer. Changing the array affects the buffer, and changing the buffer affects the array.
ByteBuffer.wrap(byte[] array, int offset, int length) also uses the passed array as the underlying buffer, with the initial position set to offset and the limit set to offset + length. Changing the array affects the buffer, and changing the buffer affects the array.
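The sharing between a wrapped array and its buffer can be seen in a short sketch. WrapDemo and its demo method are hypothetical names; the ByteBuffer calls are the standard ones listed above.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class WrapDemo {
    // Returns {array[0], whole.get(1), array[2], part.capacity()} after a few shared writes.
    static int[] demo() {
        byte[] array = new byte[4];

        ByteBuffer whole = ByteBuffer.wrap(array);
        whole.put(0, (byte) 7);  // write through the buffer, visible in the array
        array[1] = 9;            // write through the array, visible in the buffer

        ByteBuffer part = ByteBuffer.wrap(array, 2, 2); // position 2, limit 4
        part.put((byte) 5);      // relative put lands at index 2 of the same array

        return new int[] {array[0], whole.get(1), array[2], part.capacity()};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [7, 9, 5, 4]
    }
}
```

The last value shows that the buffer created with an offset and length still has the capacity of the whole array; only its position and limit are narrowed.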

DirectByteBuffer

DirectByteBuffer can only be generated by ByteBuffer.allocateDirect(int capacity).

ByteBuffer.allocateDirect() source code is as follows:

 public static ByteBuffer allocateDirect(int capacity) {
    return new DirectByteBuffer(capacity);
  }

DirectByteBuffer() source code is as follows:

  DirectByteBuffer(int cap) {          // package-private

    super(-1, 0, cap, cap);
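    // Whether direct memory must be page-aligned; on my machine it is not
    boolean pa = VM.isDirectMemoryPageAligned();
    // Page size; 4K on my machine
    int ps = Bits.pageSize();
    // With page alignment, size is ps + cap: the extra page lets cap start on a page boundary
    long size = Math.max(1L, (long)cap + (pa ? ps : 0));
    // The JVM tracks the total amount of direct memory. If the amount already allocated plus this request
    // exceeds the allowed maximum, a GC is triggered; otherwise the allocation is allowed and the request is
    // added to the allocated total. If the limit is still exceeded after the GC,
    // throw new OutOfMemoryError("Direct buffer memory");
    Bits.reserveMemory(size, cap);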

    long base = 0;
    try {
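      // As noted earlier, unsafe can operate on the underlying memory directly
      base = unsafe.allocateMemory(size);
    } catch (OutOfMemoryError x) {
      // Allocation failed: subtract the size that was just added to the allocated total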
      Bits.unreserveMemory(size, cap);
      throw x;
    }
    unsafe.setMemory(base, size, (byte) 0);
    if (pa && (base % ps != 0)) {
      // Round up to page boundary
      address = base + ps - (base & (ps - 1));
    } else {
      address = base;
    }
    cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
    att = null;
  }
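The rounding step near the end of the constructor can be tried on plain numbers. The sketch below reuses the expression from the listing; the class PageRoundUp, the method alignUp and the sample addresses are assumptions, and the bit mask only works when the page size is a power of two.

```java
class PageRoundUp {
    // Rounds base up to the next multiple of ps, as the constructor does when pa is true.
    // Assumes ps is a power of two, so (ps - 1) works as a bit mask.
    static long alignUp(long base, int ps) {
        if (base % ps != 0) {
            return base + ps - (base & (ps - 1));
        }
        return base; // already on a page boundary
    }

    public static void main(String[] args) {
        System.out.println(alignUp(10_000, 4096)); // 12288
        System.out.println(alignUp(8_192, 4096));  // 8192
    }
}
```

Since the aligned address is less than one page above base, allocating cap + ps bytes guarantees that cap bytes are still available starting at the aligned address.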

The source code of unsafe.allocateMemory() is in openjdk/src/openjdk/hotspot/src/share/vm/prims/unsafe.cpp. The specific source code is as follows:

UNSAFE_ENTRY(jlong, Unsafe_AllocateMemory(JNIEnv *env, jobject unsafe, jlong size))
 UnsafeWrapper("Unsafe_AllocateMemory");
 size_t sz = (size_t)size;
 if (sz != (julong)size || size < 0) {
  THROW_0(vmSymbols::java_lang_IllegalArgumentException());
 }
 if (sz == 0) {
  return 0;
 }
 sz = round_to(sz, HeapWordSize);
 // This ultimately calls u_char* ptr = (u_char*)::malloc(size + space_before + space_after), i.e. malloc.
 void* x = os::malloc(sz, mtInternal);
 if (x == NULL) {
  THROW_0(vmSymbols::java_lang_OutOfMemoryError());
 }
 //Copy::fill_to_words((HeapWord*)x, sz / HeapWordSize);
 return addr_to_java(x);
UNSAFE_END

The JVM allocates a contiguous buffer through malloc, and this buffer can be passed directly as the buffer parameter of operating system calls.
