This article covers image layering, container layering, and the disk space occupied by containers in Docker. I hope you find it helpful.
Docker cleverly applies the idea of layered reuse when organizing stored content, so we can use it as a case study for learning this idea.
A Docker image is divided into many layers during the construction process, and each layer is read-only. Let’s illustrate with the following example:
    # syntax=docker/dockerfile:1
    FROM ubuntu:18.04
    LABEL org.opencontainers.image.authors="org@example.com"
    COPY . /app
    RUN make /app
    RUN rm -r $HOME/.cache
    CMD python /app/app.py
Four instructions in this Dockerfile change the file system, and each of them creates a new layer.
The FROM instruction creates the base layer from the ubuntu:18.04 image.
The LABEL instruction only modifies the metadata of the image and does not create a new layer.
The COPY instruction adds the contents of the directory where the build is executed to the image, and creates a new layer to record the changes.
The first RUN instruction builds the program and writes the results into the image, creating a new layer to record the changes.
The second RUN instruction deletes the cache directory and creates a new layer to record the changes.
The CMD instruction defines the command to run in the container. It only modifies the metadata of the image and does not create a new layer.
Each layer records only the differences from the layer below it. When we create a container, a writable layer is created on top, also called the container layer. Changes made to a running container are recorded in this layer.
[Figure: the writable container layer on top of the read-only image layers]
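To make the layer count for the example Dockerfile concrete, here is a minimal Python sketch. It is only a toy classifier, not how Docker itself builds images: the two instruction sets follow this article's description and cover only the instructions that appear in the example, and the names `LAYER_CREATING`, `METADATA_ONLY`, and `classify` are made up for illustration.

```python
# Toy model of the example Dockerfile: which instructions add a layer?
# The sets below cover only the instructions discussed in this article.
LAYER_CREATING = {"FROM", "COPY", "RUN"}   # change the file system
METADATA_ONLY = {"LABEL", "CMD"}           # change image metadata only

DOCKERFILE = """\
# syntax=docker/dockerfile:1
FROM ubuntu:18.04
LABEL org.opencontainers.image.authors="org@example.com"
COPY . /app
RUN make /app
RUN rm -r $HOME/.cache
CMD python /app/app.py
"""


def classify(dockerfile: str):
    """Return (layers, metadata): the instruction lines in each group."""
    layers, metadata = [], []
    for line in dockerfile.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "# syntax=..."
        keyword = line.split(maxsplit=1)[0].upper()
        if keyword in LAYER_CREATING:
            layers.append(line)
        elif keyword in METADATA_ONLY:
            metadata.append(line)
        else:
            raise ValueError(f"instruction not covered by this sketch: {keyword}")
    return layers, metadata


layers, metadata = classify(DOCKERFILE)
print(len(layers), "layer-creating instructions")
print(len(metadata), "metadata-only instructions")
```

The sketch treats FROM as a single base layer, as the article does; a real base image may itself consist of several layers.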
The main difference between a container and an image is the writable layer on top. All writes to the container are recorded in this layer. If the container is deleted, the writable layer is deleted with it, but the image is retained.
Note: If you want multiple containers to share the same data, you can use Docker Volumes.
Each container has its own writable layer, where all of its changes are stored, so multiple containers can share the same image.
[Figure: multiple containers, each with its own writable layer, sharing the same image]
Note: There is another detail here. Multiple images may share the same layers, for example two images built on the same base image. If a layer already exists locally when building or pulling, it is not built or pulled again. Therefore, to work out how much space your images occupy, you cannot simply add up the sizes displayed by the docker images command. That sum may be greater than the actual value.
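The over-counting can be illustrated with a small Python sketch. The image names, layer digests, and sizes below are all hypothetical; the point is only that a layer shared by two images is stored once but counted twice when per-image sizes are added up.

```python
# Hypothetical images: each maps a layer digest to its size in MB.
# "sha256:base" stands for a layer that both images share.
IMAGES = {
    "app-a": {"sha256:base": 60, "sha256:a1": 25},
    "app-b": {"sha256:base": 60, "sha256:b1": 40},
}


def naive_total(images):
    """Add up per-image sizes, as if summing a size column."""
    return sum(sum(layers.values()) for layers in images.values())


def actual_total(images):
    """Count every distinct layer digest only once."""
    unique = {}
    for layers in images.values():
        unique.update(layers)
    return sum(unique.values())


print(naive_total(IMAGES))   # 185
print(actual_total(IMAGES))  # 125
```

The 60 MB difference between the two totals is exactly the shared base layer that the naive sum counts twice.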
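You can use the docker ps -s command to see the approximate space occupied by running containers. Two columns in its output mean different things:
size: the amount of data on disk used by the writable layer of each container.
virtual size: the amount of data used by the read-only image data that the container uses, plus the size of the container's writable layer.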
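As a rough sketch of how the two docker ps -s columns combine, assume that size is a container's writable layer and virtual size is the read-only image data plus that writable layer. For several containers started from the same image, the image data is then stored once and only the writable layers add up. The function name and the numbers are hypothetical.

```python
def disk_usage_same_image(containers):
    """Estimate total disk usage in MB for containers of one image.

    `containers` is a list of (size, virtual_size) pairs, where size is
    assumed to be the writable layer and virtual_size the read-only image
    data plus that writable layer.
    """
    if not containers:
        return 0
    first_size, first_virtual = containers[0]
    image_size = first_virtual - first_size         # shared, stored once
    writable = sum(size for size, _ in containers)  # one per container
    return image_size + writable


# Three hypothetical containers started from the same 100 MB image.
print(disk_usage_same_image([(2, 102), (5, 105), (0, 100)]))  # 107
```

Adding up the three virtual sizes instead would give 307 MB, which counts the shared image three times.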
Other ways in which containers occupy disk space:
The storage drivers in Docker all use the copy-on-write (CoW) strategy.
The CoW strategy shares and copies files as efficiently as possible. If a file exists in a lower layer of the image and an upper layer (including the writable layer) needs to read it, the upper layer uses the existing file directly. Only when the file needs to be modified is it copied into the upper layer and modified there. This minimizes I/O and the size of each subsequent layer.
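The read and copy-up behaviour described above can be sketched in a few lines of Python. This is a toy in-memory model, not how a real storage driver such as overlay2 works: layers are plain dictionaries, and the `Container` class and the file paths are hypothetical.

```python
class Container:
    """Toy copy-on-write view over shared, read-only image layers."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared by reference, never modified
        self.writable = {}                # this container's own layer

    def read(self, path):
        # Look in the writable layer first, then from the top image layer down.
        if path in self.writable:
            return self.writable[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # Copy-up: the first write copies the file into the writable layer.
        if path not in self.writable:
            self.writable[path] = self.read(path)
        self.writable[path] += data


# Two containers share the same image layers (listed bottom to top).
image = [{"/etc/motd": "hello"}, {"/app/log.txt": "start\n"}]
c1, c2 = Container(image), Container(image)
c1.append("/app/log.txt", "c1 was here\n")

print(repr(c1.read("/app/log.txt")))  # 'start\nc1 was here\n'
print(repr(c2.read("/app/log.txt")))  # 'start\n'
```

The write in c1 lands only in its own writable layer, so the shared image layers and the second container are unaffected.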
When we use docker pull to pull an image, or create a container from an image that is not available locally, the image is stored layer by layer in Docker's local storage area. On Linux this is usually /var/lib/docker.
We can look in the /var/lib/docker/<storage-driver> directory to see the layers of the images we have pulled, for example /var/lib/docker/overlay2 when the overlay2 storage driver is in use.
With so many layers, we can use docker image inspect to see which layers a given image contains:
    docker image inspect --format "{{json .RootFS.Layers}}" redis
    docker image inspect --format "{{json .RootFS.Layers}}" mysql:5.7
From the output of the commands above, we can see that redis and mysql:5.7 use the same layer. Sharing layers greatly reduces the space needed to store images and also speeds up pulling them.
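Comparing two layer lists by eye is error-prone, so here is a minimal Python sketch that finds the common entries. It assumes each input is the JSON array printed by docker image inspect with the --format shown above; the digests in the example are shortened, hypothetical values, and `shared_layers` is a name chosen for illustration.

```python
import json


def shared_layers(layers_json_a, layers_json_b):
    """Return the digests that appear in both JSON layer lists, in order."""
    a = json.loads(layers_json_a)
    b = set(json.loads(layers_json_b))
    return [digest for digest in a if digest in b]


# Hypothetical, shortened digests standing in for real command output.
redis_layers = '["sha256:aaa", "sha256:bbb", "sha256:ccc"]'
mysql_layers = '["sha256:aaa", "sha256:ddd"]'

print(shared_layers(redis_layers, mysql_layers))  # ['sha256:aaa']
```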
We can use the docker image history command to view the layers of an image. Taking redis as an example:
    docker history redis
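Note:
Some steps have a size of 0 because they only change metadata. They do not produce a new layer and take up no extra space (apart from the metadata itself). That is why the redis image above contains 5 layers.
<missing> steps: these steps may be one of the following cases: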
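When we start a container, a writable layer is added on top of the image to store all changes. Modifying an existing file uses the CoW strategy: Docker first searches the layers for the file, then copies it to the writable layer, and then modifies and stores it there.
Doing so minimizes I/O operations.
However, if an application in a container writes frequently, the writable layer will clearly keep growing. In that case we can use a Volume to take over that load.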
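Container metadata and logs are stored separately, usually under /var/lib/docker/containers. We can run du -sh /var/lib/docker/containers/* to see how much space each container takes up. (The container ID is simply the first 12 characters of the folder name.)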
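The same per-container breakdown can be sketched in Python. This is only an approximation of du: it adds up file sizes rather than allocated disk blocks, so the numbers can differ from what du reports. The function names are made up for illustration, and reading /var/lib/docker/containers normally requires root privileges.

```python
import os


def dir_size(path):
    """Sum the sizes in bytes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def container_usage(containers_dir):
    """Map each short container ID (first 12 characters) to its size in bytes."""
    usage = {}
    for entry in os.scandir(containers_dir):
        if entry.is_dir(follow_symlinks=False):
            usage[entry.name[:12]] = dir_size(entry.path)
    return usage
```

Calling container_usage("/var/lib/docker/containers") would then return one entry per container folder, keyed by the same short ID that docker ps displays.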