On databases not being a good fit for Docker, there are two articles by foreign authors: the one the original poster linked, and this one, which has been translated.
My view is still the same:
At small scale you can do whatever you like. At large scale, some things stop working. Traditional databases and Docker are not a good match, and I recommend not containerizing them directly. If you must containerize a database, you need a whole set of supporting systems, including middleware and a container platform.
If your database handles scaling, disaster recovery and failover automatically and ships with its own multi-node solution, Docker is a good fit.
If not, don't use Docker.
The original text also makes this very clear:
"Horizontal scaling in Docker only works for stateless compute services, not for databases."
When traffic is small, anything can be containerized: databases, applications, Hadoop, all kinds of nodes, nginx.
At large scale, storage-related services are not suitable for containerization. Stateless services such as the application layer and business layer are suitable. Memory-bound services such as caches can also be containerized.
Put simply, there are three issues: disaster recovery, performance and data consistency.
For a traditional database like MySQL, I can list plenty of problems:
How do you containerize MySQL in the first place?
What do you do if mysqld on the master dies?
What do you do if dockerd on the master dies?
What do you do if mysqld on a slave dies?
What do you do if dockerd on a slave dies?
Can MySQL be scaled out quickly with containers when a peak is approaching? What is the plan?
What is the master-slave switchover plan? How do you guarantee consistency?
At peak the load is heavy enough that a physical machine sometimes only has capacity for a single MySQL process.
So it is one instance per machine anyway. Why not just start MySQL directly?
Why wrap a container around it? How much performance is lost?
How do you upgrade MySQL?
Will the data volume lose data? (I have run into corrupted containers many times...)
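To make the master-slave switching question a bit more concrete, here is a minimal Python sketch of the decision a failover tool has to make: pick the healthy slave that has applied the most of the master's log, then work out how much was never replicated. The `Replica` class, its field names and the integer "position" are all made up for illustration. This is not how any particular MySQL tool does it, just the shape of the problem.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    # All fields are hypothetical; a real tool would read them from the servers.
    name: str
    healthy: bool          # result of some external health check
    applied_position: int  # how far this slave has applied the master's log


def pick_new_master(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Return the healthy replica that has applied the most log, or None."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.applied_position)


def lost_entries(old_master_position: int, new_master: Replica) -> int:
    """Log entries the old master wrote that the new master never applied."""
    return max(0, old_master_position - new_master.applied_position)


if __name__ == "__main__":
    slaves = [
        Replica("slave-a", True, 120),
        Replica("slave-b", True, 150),
        Replica("slave-c", False, 200),  # most up to date, but unreachable
    ]
    chosen = pick_new_master(slaves)
    print(chosen.name, lost_entries(200, chosen))
```

Any non-zero result from `lost_entries` is exactly the consistency problem: with asynchronous replication those writes are gone unless something else recovers them. Putting the instances in containers does not change that either way.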
That said, MySQL is not entirely impossible to containerize. Workloads that are not sensitive to data loss (such as the product data behind JD.com search) can be containerized, using database sharding to raise throughput by adding instances.
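As a rough illustration of "sharding to raise throughput by adding instances", here is a minimal Python sketch of hash-based routing. The instance names and the key format are assumptions for the example only; nothing here comes from JD.com's actual setup.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(key: str, instances: Sequence[str]) -> str:
    """Pick the instance (for example a containerized MySQL) that owns the key."""
    return instances[shard_for(key, len(instances))]


if __name__ == "__main__":
    # Hypothetical instance names, one per container.
    instances = ["mysql-0", "mysql-1", "mysql-2", "mysql-3"]
    print(route("sku-10001", instances))
```

Note that with plain modulo routing, changing the number of instances remaps most keys. That is tolerable only when the data can be rebuilt or lost, which is the kind of workload described above.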
As for the points made in the original article, some of them are flawed, though on the whole it is well thought out. For example, the following point (about shared data directories) is very questionable:
Of the databases I have worked with so far, only ones like Cassandra are suitable for containerization (TiDB and CockroachDB too, but I have not yet seen them used at large companies).
But Cassandra itself behaves almost like a stateless service: it provides its own disaster recovery, scale-out and failover.
A word about JD.com.
JD.com is an outlier, but even JD.com has mentioned similar problems and points that need attention.
On a single machine there is no problem, and in some cases it is actually beneficial. For example, my company's Oracle database once crashed while its parameters were being tuned and would not start again. Fortunately it was running in Docker at the time, so I just started a fresh container and pointed its data directory at the original one.
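The recovery described here boils down to starting a new container with the old data directory bind-mounted into it. A minimal Python sketch that builds such a command follows. The container name, image name and both paths are placeholders, not the ones from the actual incident; only the `docker run` flags `-d`, `--name` and `-v` are real.

```python
from typing import List


def docker_run_command(name: str, image: str,
                       host_data_dir: str, container_data_dir: str) -> List[str]:
    """Build a `docker run` argv that bind-mounts an existing data directory."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "-v", f"{host_data_dir}:{container_data_dir}",
        image,
    ]


if __name__ == "__main__":
    # All four values below are hypothetical.
    cmd = docker_run_command(
        "db-recovered", "example/database:latest",
        "/data/db-original", "/var/lib/db",
    )
    print(" ".join(cmd))
    # To actually start the container: subprocess.run(cmd, check=True)
```

This only works if the data lived on the host (or in a volume) to begin with. Data written to the container's own writable layer goes away with the container.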
As for clusters, whether you set them up by hand with Docker or with a tool like Swarm, it is not easy. Building directly on physical machines saves trouble.
There are companies abroad that specialize in Docker data storage solutions, for example Flocker and Rancher's Convoy.
Docker is better suited to stateless services that do not change.
If you have a large number of clusters: write the Dockerfile. Then, once the code has been pushed to the code repository and the Dockerfile deployed, let the release script build the Docker services in batches and put the service code into them.
Not only MySQL: things like Redis and memcached (mc) are not suitable for Docker either. In other words, putting a database in Docker is just using Docker for its own sake, with little real benefit.
Yes, Docker's characteristics make it unsuitable for data storage. Not just databases: no storage-related service is a good fit for Docker.
JD.com has even done a lot of customization to Docker to make this work.
You can look at what JD.com has published about it.
Not suitable does not mean impossible.
I can’t even figure out which directory the official mysql image stores data in.
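For what it's worth, as far as I know the official mysql image keeps its data under /var/lib/mysql and declares that path as a volume. You can check any image yourself: `docker image inspect <image>` prints JSON, and the declared volumes sit under `Config.Volumes`. Here is a small Python sketch that pulls them out. The sample JSON in the example is trimmed down by hand and is not real command output.

```python
import json
from typing import List


def declared_volumes(inspect_output: str) -> List[str]:
    """Return the volume paths declared in `docker image inspect` JSON output."""
    images = json.loads(inspect_output)
    paths = set()
    for image in images:
        # Config or Volumes may be missing or null when nothing is declared.
        volumes = (image.get("Config") or {}).get("Volumes") or {}
        paths.update(volumes)
    return sorted(paths)


if __name__ == "__main__":
    # Hand-written sample in the shape of the real output (an array of objects).
    sample = '[{"Config": {"Volumes": {"/var/lib/mysql": {}}}}]'
    print(declared_volumes(sample))
    # Against a real image you could feed it the output of:
    #   subprocess.run(["docker", "image", "inspect", "mysql"],
    #                  capture_output=True, text=True, check=True).stdout
```

If you do not mount anything on that path yourself, Docker creates an anonymous volume for it, which is probably why the data seems hard to find on the host.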