RHCS Principles and Operation

RHCS component introduction:
1.1 Distributed Cluster Manager (CMAN)
Cluster Manager, abbreviated CMAN, is a distributed cluster management tool. It runs on every node of the cluster and provides cluster management services for RHCS. CMAN manages cluster membership, messaging, and notification. It tracks the membership relationships between nodes by monitoring the running state of each node; when a node in the cluster fails, the membership changes, and CMAN promptly notifies the lower layers of this change so they can adjust accordingly.
1.2 Lock Management (DLM)
Distributed Lock Manager, abbreviated DLM, is an underlying base component of RHCS that provides a common lock-handling mechanism for the cluster. In an RHCS cluster, DLM runs on every node: GFS synchronizes access to file system metadata through the lock manager, and CLVM uses it to synchronize updates to LVM volumes and volume groups. DLM does not require a dedicated lock-management server; it uses a peer-to-peer locking scheme, which greatly improves performance and avoids the bottleneck of having to perform a full recovery when a single node fails. In addition, DLM requests are handled locally and do not require a network round trip, so they take effect immediately. Finally, through a layered mechanism, DLM can implement parallel locking across multiple lock spaces.
1.3 Configuration File Management (CCS)
Cluster Configuration System, abbreviated CCS, is mainly used to manage the cluster configuration file and keep it synchronized across nodes. CCS runs on every node of the cluster and monitors the single configuration file /etc/cluster/cluster.conf on each node. Whenever this file changes, CCS propagates the change to every node in the cluster, keeping the configuration files of all nodes synchronized at all times. For example, if the administrator updates the cluster configuration file on node A, CCS detects the change and immediately propagates it to the other nodes. The RHCS configuration file is cluster.conf, an XML file that contains the cluster name, cluster node information, cluster resource and service definitions, fence devices, and so on.
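The overall layout of cluster.conf looks roughly like the following skeleton. This is only an illustrative sketch: the fence device, virtual IP, script resource, and their parameters are placeholders and are not taken from this article's cluster.

<?xml version="1.0"?>
<cluster name="mycluster" config_version="35">
  <clusternodes>
    <clusternode name="web1" nodeid="4" votes="1">
      <fence>
        <method name="1">
          <device name="fence-web1"/>
        </method>
      </fence>
    </clusternode>
    <!-- additional clusternode entries for web2, Mysql1, Mysql2 ... -->
  </clusternodes>
  <fencedevices>
    <!-- placeholder fence device; agent and parameters depend on your hardware -->
    <fencedevice name="fence-web1" agent="fence_ipmilan" ipaddr="..." login="..." passwd="..."/>
  </fencedevices>
  <rm>
    <resources>
      <ip address="192.168.12.233" monitor_link="1"/>
      <script name="httpd" file="/etc/init.d/httpd"/>
    </resources>
    <service name="webserver" autostart="1">
      <ip ref="192.168.12.233"/>
      <script ref="httpd"/>
    </service>
  </rm>
</cluster>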
1.4 Fence Device (FENCE)
The fence device is an essential component of an RHCS cluster; it exists to avoid the "split brain" phenomenon that unpredictable failures can otherwise cause. A fence device issues hardware management commands directly through the management interface of the server or storage itself, or through an external power management device, to restart or power off a server, or to cut its connection to the network. The working principle of fencing is as follows: when a host becomes abnormal or goes down for an unexpected reason, the standby machine first invokes the fence device to restart the abnormal host or isolate it from the network; once the fence operation completes successfully, a result is returned to the standby machine, which then begins to take over the host's services and resources. In this way, the fence device releases the resources held by the abnormal node and ensures that resources and services are always running on only one node. RHCS fence devices fall into two categories: internal and external. Commonly used internal fence devices include the IBM RSA II card, HP's iLO card, and IPMI devices; external fence devices include UPS, SAN switches, network switches, and so on.
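To verify that fencing works, the fence agent can usually also be invoked by hand. The following is only a sketch for an IPMI-based internal fence device; the management IP address and credentials are placeholders, and the exact agent depends on your hardware:

# Query the power status of web2 through its IPMI interface (address and credentials are placeholders)
[root@web1 ~]# fence_ipmilan -a 192.168.12.250 -l admin -p secret -o status
# Fence a node through whatever fence device is configured for it in cluster.conf
[root@web1 ~]# fence_node web2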
RHCS operation:
1. Start the RHCS cluster
The core processes of the RHCS cluster are cman and rgmanager. To start the cluster, execute the following commands on each node of the cluster in sequence:
service cman start
service rgmanager start
Note that these two commands must be executed in order: start cman first, then rgmanager. After the cman service has started successfully on every node in the cluster, go on to start the rgmanager service on each node in turn.
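If you want cman and rgmanager to come up automatically at boot (on RHEL 5/6 style init systems), a common, minimal setup is the following; run it on every cluster node:

[root@web1 ~]# chkconfig cman on          # enable the cman init script at boot
[root@web1 ~]# chkconfig rgmanager on     # enable the rgmanager init script at boot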

2. Shut down the RHCS cluster
Shutting down is the reverse of starting. The commands to shut down the RHCS cluster are:
service rgmanager stop
service cman stop
First stop the rgmanager service on each node in the cluster in turn; after the rgmanager service has stopped successfully on all nodes, stop the cman service on each node in turn to complete the shutdown of the whole cluster.
Sometimes stopping the cman service fails. In that case, check whether the shared-storage GFS2 file system on the local machine has been unmounted, and also check whether the rgmanager service on the other nodes has shut down cleanly.
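For example (assuming the GFS2 file system is mounted at /gfs2, as in the examples later in this article), you can check for and unmount any GFS2 mounts before stopping cman:

[root@web1 ~]# mount | grep gfs2          # is a GFS2 file system still mounted on this node?
[root@web1 ~]# umount /gfs2               # unmount it, then run "service cman stop" again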

3. Management of application services
After the cluster system starts, application services are started automatically by default. If an application service does not start automatically, it must be started manually. The command for managing application services is clusvcadm; with it you can start, stop, restart, and relocate the application services in the cluster.
1. Start an application service
You can start the application service of a node in the following way:
clusvcadm -e <service> -m <node>
Where:
<service>: the name of the application service created in the cluster.
<node>: the name of the cluster node.
For example, to start the webserver service on node web1, the operation is as follows:
[root@web1 ~]# clusvcadm -e webserver -m web1
Member web1 trying to enable service:webserver...Success
service:webserver is now running on web1
You can view the details of the service start-up in the /var/log/messages file. When webserver starts, the cluster resources associated with the service, such as the virtual IP and the application service script, are started along with it. You can check whether these cluster resources have been loaded normally, for example:
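One typical check (a sketch; the exact interface and service names depend on your configuration) is to confirm that the floating IP is configured on this node and that the service is reported as started. The clustat command is described in the monitoring section below.

[root@web1 ~]# ip addr show               # the webserver service's virtual IP should appear on one of the interfaces
[root@web1 ~]# clustat                    # service:webserver should be listed as "started" on web1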
2. Stop an application service
You can shut down the application service of a node in the following way:
clusvcadm -s <service> -m <node>
For example, to shut down the mysqlserver service on node Mysql1, do the following:
[root@Mysql1 ~]# clusvcadm -s mysqlserver -m Mysql1
Member Mysql1 stopping service:mysqlserver...Success
You can view the details of the service shutdown in the /var/log/messages file. When mysqlserver is stopped, the cluster resources associated with the service, such as the virtual IP and the application service script, are released as well.
3. Restart an application service
You can restart the application service of a node as follows:
clusvcadm -R <service> -m <node>
For example, to restart the webserver service on node web1, do the following:
[root@web2 ~]# clusvcadm -R webserver -m web1
Member web1 trying to restart service:webserver...Success
This command was executed on node web2, yet it still successfully restarted the webserver service on node web1. This shows that the clusvcadm command can be executed on any node in the cluster.
4. Switch a service
You can switch an application service from one node to another node in the following way:
clusvcadm -r <service> -m <node>

For example, to relocate the webserver service from node web1 to node web2, do the following:
[root@web1 ~]# clusvcadm -r webserver -m web2
Trying to relocate service:webserver to web2...Success
service:webserver is now running on web2

4. Monitoring the RHCS cluster status
Monitoring the RHCS cluster helps you understand the health of each node and discover and resolve problems in a timely manner. RHCS provides a rich set of status-viewing commands; here we mainly introduce the use of cman_tool, clustat, and ccs_tool.
1. cman_tool command
cman_tool has many parameters, but its usage is relatively simple. The basic usage format is:
cman_tool <subcommand> [options]
Here are a few simple usage examples:
[root@web1 ~]# cman_tool nodes -a
Node  Sts   Inc   Joined               Name
   0   M      0   2010-08-23 01:24:00  /dev/sdb7
   1   M   2492   2010-08-23 01:22:43  web2
       Addresses: 192.168.12.240
   2   M   2492   2010-08-23 01:22:43  Mysql1
       Addresses: 192.168.12.231
   3   M   2492   2010-08-23 01:22:43  Mysql2
       Addresses: 192.168.12.232
   4   M   2488   2010-08-23 01:22:43  web1
       Addresses: 192.168.12.230
This command displays each node's name, its corresponding IP address, and the time it joined the cluster.
If you want to know more cluster node information, you can use the following command:
[root@web1 ~]# cman_tool status
Version: 6.2.0
Config Version: 35                      # version number of the cluster configuration file
Cluster Name: mycluster                 # cluster name
Cluster Id: 56756
Cluster Member: Yes
Cluster Generation: 2764
Membership state: Cluster-Member
Nodes: 4                                # number of cluster nodes
Expected votes: 6                       # expected number of votes
Quorum device votes: 2                  # votes contributed by the quorum disk
Total votes: 6                          # total number of votes in the cluster
Quorum: 4                               # votes required for quorum; below this value the cluster stops providing service
Active subsystems: 9
Flags: Dirty
Ports Bound: 0 177
Node name: web1
Node ID: 4                              # ID of this node within the cluster
Multicast addresses: 239.192.221.146    # cluster multicast address
Node addresses: 192.168.12.230          # IP address of this node
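Besides "nodes" and "status", cman_tool also has a "services" subcommand, which shows the fence domain and DLM lock spaces this node has joined. Its output format differs between versions, so only the command itself is shown here:

[root@web1 ~]# cman_tool services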
2. clustat command
The clustat command is very simple to use; detailed help is available via "clustat -h". Here are just a few examples.

[root@web1 ~]# clustat -i 3
Cluster Status for mycluster @ Mon Aug 23 18:54:15 2010
Member Status: Quorate

 Member Name           ID   Status
 ------ ----           ---- ------
 web2                     1 Online, rgmanager
 Mysql1                   2 Online, rgmanager
 Mysql2                   3 Online, rgmanager
 web1                     4 Online, Local, rgmanager
 /dev/sdb7                0 Online, Quorum Disk

 Service Name           Owner (Last)        State
 ------- ----           ----- ------        -----
 service:mysqlserver    Mysql1              started
 service:webserver      web1                started
The meaning of this output is as follows:
The "-i" option of clustat makes it display the running status of each node and service in the cluster in real time; "-i 3" means the cluster status is refreshed every three seconds.
In this output you can see that every node is in the "Online" state, which means each node is running normally; if a node had left the cluster, its status would show "Offline". You can also see that the two cluster services are in the "started" state, running on node Mysql1 and node web1 respectively.
In addition, the "ID" column shows how node names map to node numbers: for example, web2 corresponds to node 1 in this cluster, and likewise web1 corresponds to node 4. Understanding this node order helps when interpreting the cluster logs.

3. ccs_tool command
ccs_tool is mainly used to manage the cluster configuration file cluster.conf. With ccs_tool you can add or delete cluster nodes, add or delete fence devices, update the cluster configuration file, and perform other such operations.
The following are several application examples of ccs_tool:
After modifying the configuration file on one node, you can execute the "ccs_tool update" command to update the configuration file on all nodes, for example:
[root@web1 cluster]# ccs_tool update /etc/cluster/cluster.conf
Proposed updated config file does not have greater version number.
Current config_version :: 35
Proposed config_version:: 35
Failed to update config file.
ccs_tool decides whether to apply the update based on the "config_version" value in cluster.conf. Therefore, after modifying the cluster.conf file, you must increase its config_version value; only then will ccs_tool accept the configuration file when you run the update.
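In practice this means editing the opening tag of cluster.conf and raising config_version before running ccs_tool update again; with the cluster used in these examples, the change would look like this:

<cluster name="mycluster" config_version="36">

After bumping the version, re-run the update: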
[root@web1 cluster]# ccs_tool update /etc/cluster/cluster.conf
Config file updated from version 35 to 36
Update complete.
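ccs_tool also provides subcommands for inspecting the current configuration. On the RHCS releases this article is based on, the following are typically available (output is omitted here because it depends entirely on your configuration):

[root@web1 cluster]# ccs_tool lsnode      # list the cluster nodes defined in cluster.conf
[root@web1 cluster]# ccs_tool lsfence     # list the fence devices defined in cluster.conf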

5. Management and maintenance of the GFS2 file system
The GFS2 file system provides many management and maintenance tools. Commonly used ones include gfs2_fsck, gfs2_tool, gfs2_jadd, gfs2_quota, gfs2_convert, etc. Here we focus on the usage of the first three commands.

1. gfs2_fsck command
The gfs2_fsck command is similar to the fsck.ext3 command for the ext3 file system and is mainly used to detect and repair file system errors. GFS2 also provides a fsck.gfs2 command, which is exactly the same as gfs2_fsck.
The usage of gfs2_fsck is as follows:
gfs2_fsck [-afhnpqvVy] <device>
Here are a few usage examples:
[root@Mysql1 ~]# gfs2_fsck -y /dev/sdb5
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Starting pass1c
Pass1c complete
…………
Pass5 complete
gfs2_fsck complete
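Note that, like other fsck tools, gfs2_fsck should only be run against a file system that is not mounted on any node. A typical sequence, assuming the mount point /gfs2 used elsewhere in this article, is:

[root@Mysql1 ~]# umount /gfs2                  # unmount on every node first
[root@Mysql1 ~]# gfs2_fsck -y /dev/sdb5        # then check and repair the device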

2. gfs2_tool command
The gfs2_tool command has many parameters, but its use is not complicated. It is mainly used to view and modify parameters of a GFS2 file system.
Here are a few usage examples:
1) View GFS2 file system mounting information
[root@web1 ~]# gfs2_tool df /gfs2
/gfs2:
SB lock proto = "lock_dlm"
SB lock table = "mycluster:my-gfs2"
SB ondisk format = 1801
SB multihost format = 1900
Block size = 4096
Journals = 4
Resource Groups = 19
Mounted lock proto = "lock_dlm"
Mounted lock table = "mycluster:my-gfs2"
Mounted host data = "jid=2:id=65539:first=0"
Journal number = 2
Lock module flags = 0
Local flocks = FALSE
Local caching = FALSE

Type     Total Blocks   Used Blocks   Free Blocks   use%
--------------------------------------------------------
data     1220724        136578        1084146       11%
inodes   1084263        117           1084146       0%

2) Lock and unlock the GFS2 file system:
[root@node1 gfs2]# gfs2_tool freeze /gfs2
[root@node1 gfs2]# gfs2_tool unfreeze /gfs2
After the GFS2 file system is frozen, no read or write operations can be performed on it until it is unfrozen.
3) Query the number of nodes that can mount the GFS2 file system
[root@web1 ~]# gfs2_tool journals /gfs2
journal2 - 128MB
journal3 - 128MB
journal1 - 128MB
journal0 - 128MB
4 journal(s) found.
This shows that the number of mountable nodes is 4, and the size of each journal is 128MB.
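The number of journals is fixed when the file system is created (although gfs2_jadd, described below, can add more). For reference, a GFS2 file system with 4 journals and the lock table shown by gfs2_tool df above would have been created with a command roughly like the following; the device name is only illustrative:

[root@web1 ~]# mkfs.gfs2 -p lock_dlm -t mycluster:my-gfs2 -j 4 /dev/sdb5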
4) Display GFS2 version information:
[root@web1 ~]# gfs2_tool version
gfs2_tool 0.1.62 (built Mar 31 2010 07:34:25)
Copyright (C) Red Hat, Inc. 2004-2006 All rights reserved

3. gfs2_jadd command
gfs2_jadd is mainly used to configure the number and size of GFS2 journals. Its usage is very simple:
gfs2_jadd [-cDhJjqV] /path/to/filesystem
Here are a few use cases:
Set the size of the journals being added to 64MB:
[root@Mysql1 ~]# gfs2_jadd -J 64 /gfs2
Increase the number of nodes that can mount the GFS2 file system simultaneously to 5:
[root@Mysql1 ~]# gfs2_jadd -j 5 /gfs2
In addition, gfs2_quota is used for disk quota management on GFS2 file systems, and gfs2_convert is a data conversion tool that updates the metadata of a GFS file system and converts it to GFS2. For details on their use, refer to their help information; they are not covered in depth here.
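For completeness, a rough sketch of the GFS-to-GFS2 conversion workflow (the device name is a placeholder; back up the data and make sure the file system is unmounted on all nodes and clean before converting):

[root@Mysql1 ~]# gfs_fsck -y /dev/sdb6         # check the old GFS file system first
[root@Mysql1 ~]# gfs2_convert /dev/sdb6        # convert its metadata in place to GFS2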
