Connection tracking is the foundation of many network applications. For example, Kubernetes Service, ServiceMesh sidecars, the software Layer 4 load balancers LVS/IPVS, Docker networking, OVS, and the iptables host firewall all rely on it. Connection tracking, as the name suggests, tracks (and records) the state of connections. For example, Figure 1.1 shows a Linux machine with the IP address 10.1.1.2. There are three connections on this machine:
A connection from the machine to an external HTTP service (destination port 80)
A connection from outside to the FTP service on the machine (destination port 21)
A connection from the machine to an external DNS service (destination port 53)
What connection tracking does is discover and track the state of these connections, including:
Extracting tuple information from each packet to identify the flow and the corresponding connection.
Maintaining a state database (the conntrack table) for all connections, holding data such as each connection's creation time, number of packets sent, and number of bytes sent.
Recycling expired connections (GC).
Providing services for higher-level functions (such as NAT).
Note that the "connection" in connection tracking is not exactly the same as the "connection" in "connection-oriented" as used in TCP/IP. To put it simply:
In TCP/IP, a connection is a Layer 4 concept. TCP is connection-oriented: every packet sent must be acknowledged (ACK) by the peer, and there is a retransmission mechanism. UDP is connectionless: packets sent do not require a response from the peer, and there is no retransmission mechanism.
In conntrack (CT), a flow defined by a tuple represents a connection. We will see later that UDP, and even ICMP (a Layer 3 protocol), have connection records in CT, but not every protocol is tracked.
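The responsibilities listed above (identify the flow from a tuple, keep per-connection counters, recycle expired entries) can be sketched in a few lines of Python. This is only a toy model for illustration, not how the kernel stores its conntrack table: the 5-tuple layout, the class and field names, the 30-second timeout, and the peer address 10.2.2.2 are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed tuple layout: (src_ip, src_port, dst_ip, dst_port, proto).
FlowTuple = Tuple[str, int, str, int, str]


@dataclass
class Entry:
    created: float
    last_seen: float
    packet_count: int = 0
    byte_count: int = 0


class ConntrackTable:
    """Toy connection table: one entry per flow, shared by both directions."""

    def __init__(self, timeout: float = 30.0) -> None:
        self.timeout = timeout
        self.entries: Dict[FlowTuple, Entry] = {}

    @staticmethod
    def reverse(t: FlowTuple) -> FlowTuple:
        src_ip, src_port, dst_ip, dst_port, proto = t
        return (dst_ip, dst_port, src_ip, src_port, proto)

    def track(self, t: FlowTuple, size: int, now: Optional[float] = None) -> Entry:
        now = time.time() if now is None else now
        # A reply packet carries the reversed tuple of the original direction.
        key = t if t in self.entries else self.reverse(t)
        entry = self.entries.get(key)
        if entry is None:
            # Neither direction is known yet: this packet opens a new record.
            entry = self.entries[t] = Entry(created=now, last_seen=now)
        entry.packet_count += 1
        entry.byte_count += size
        entry.last_seen = now
        return entry

    def gc(self, now: Optional[float] = None) -> int:
        """Delete entries idle for longer than the timeout; return how many."""
        now = time.time() if now is None else now
        expired = [k for k, e in self.entries.items() if now - e.last_seen > self.timeout]
        for k in expired:
            del self.entries[k]
        return len(expired)


table = ConntrackTable(timeout=30.0)
request = ("10.1.1.2", 40000, "10.2.2.2", 80, "tcp")
table.track(request, size=60, now=0.0)                             # outgoing request
table.track(ConntrackTable.reverse(request), size=1500, now=1.0)   # reply, same entry
```

Because the reply is looked up by its reversed tuple, both directions update the same record, which is the sense in which a tuple-defined flow "is" a connection here, whether the protocol is TCP or UDP.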
Netfilter
Connection tracking on Linux is implemented in Netfilter. Netfilter is a framework in the Linux kernel for controlling, modifying, and filtering packets (manipulation and filtering). It sets several hook points in the kernel protocol stack to intercept, filter, or otherwise process packets.
When connection tracking (conntrack) comes up, Netfilter is probably the first thing that comes to mind. But Netfilter is just one implementation of connection tracking in the Linux kernel. In other words, as long as you have a hook capability and can intercept every packet entering and leaving the host, you can build your own connection tracking on top of it. The cloud native network solution Cilium implemented such an independent connection tracking and NAT mechanism in version 1.7.4 (full functionality requires kernel 4.19). The basic principle is:
Implement packet interception based on BPF hooks (equivalent to the hook mechanism in Netfilter).
On top of the BPF hooks, implement a new conntrack and NAT.
Therefore, even if Netfilter is unloaded, Cilium's support for Kubernetes ClusterIP, NodePort, ExternalIPs, and LoadBalancer is unaffected. Because this connection tracking mechanism is independent of Netfilter, its conntrack and NAT information is not stored in the kernel's (that is, Netfilter's) conntrack table and NAT table. Conventional tools such as conntrack, netstat, ss, and lsof therefore cannot see it. You must use Cilium commands, for example:
$ cilium bpf nat list
$ cilium bpf ct list global
Iptables
iptables is a user-space tool for configuring Netfilter's filtering functions. Netfilter, which lives in kernel space, is the real security framework behind the firewall; iptables is just a command-line tool in user space that we use to operate that framework. iptables processes packets according to the actions defined by rules, such as accept, reject, and drop.
For example, when a client accesses a server's web service, the client's message arrives at the network card. Because the TCP/IP protocol stack is part of the kernel, the message passes through the kernel's TCP stack before it reaches the web service in user space. At this point the destination of the client's message is the socket (IP:port) that the web service listens on. When the web service responds to the client's request, the destination of the response is the client, and the IP and port that the web service listens on become the source.
As noted, Netfilter is the real firewall, and it is part of the kernel. So for the firewall to achieve its purpose of "fire prevention", checkpoints must be set up in the kernel. All incoming and outgoing messages must pass through these checkpoints; after inspection, those that meet the release conditions are let through, and those that meet the blocking conditions are blocked.
iptables has 4 tables and 5 chains. Tables are distinguished by the operation performed on the packet (filtering, NAT, etc.), and chains are distinguished by hook point. Tables and chains are effectively the two dimensions of Netfilter.
The four tables of iptables are filter, mangle, nat, and raw. The default table is filter.
filter table: Used to filter packets; its rules determine how a packet is handled.
nat table: Mainly used to modify the IP address and port information of packets.
mangle table: Mainly used to modify a packet's type of service (TOS) and time to live (TTL), set marks on packets, and support traffic shaping, policy routing, and so on.
raw table: Mainly used to decide whether connection (state) tracking is performed on a packet.
The five chains of iptables are PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING.
INPUT chain: Applied when a packet addressed to the local machine is received.
OUTPUT chain: Applied when the local machine sends a packet out.
FORWARD chain: Applied when a packet is received that must be forwarded to another address. Note that forwarding requires the ip_forward function in the Linux kernel to be enabled.
PREROUTING chain: Applied before the routing decision is made for a packet.
POSTROUTING chain: Applied after the routing decision is made for a packet.
The correspondence between tables and chains is shown below. Consider how a message flows in some common scenarios:
A message destined for a process on the local machine: PREROUTING –> INPUT.
A message forwarded by this machine: PREROUTING –> FORWARD –> POSTROUTING.
A message (usually a response) sent by a process on the local machine: OUTPUT –> POSTROUTING.
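The three scenarios above can be written down as a small lookup. The sketch below also lists which tables are consulted on each chain, in the usual raw, mangle, nat, filter order. That table-per-chain mapping is stated from general iptables knowledge rather than from this article, it models only the four tables named here, and it leaves out the nat table's INPUT chain that newer kernels provide. The scenario and function names are made up for illustration.

```python
from typing import Dict, List

# Tables consulted on each chain, in the order raw -> mangle -> nat -> filter.
# Assumption: only the four tables from the article are modelled, and the nat
# table's INPUT chain (present on newer kernels) is omitted.
TABLES_BY_CHAIN: Dict[str, List[str]] = {
    "PREROUTING": ["raw", "mangle", "nat"],
    "INPUT": ["mangle", "filter"],
    "FORWARD": ["mangle", "filter"],
    "OUTPUT": ["raw", "mangle", "nat", "filter"],
    "POSTROUTING": ["mangle", "nat"],
}

# Chains traversed in the three scenarios described above.
PATHS: Dict[str, List[str]] = {
    "to_local_process": ["PREROUTING", "INPUT"],
    "forwarded": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "from_local_process": ["OUTPUT", "POSTROUTING"],
}


def traversal(scenario: str) -> List[str]:
    """Return 'CHAIN:table' steps in the order a packet would meet them."""
    return [
        f"{chain}:{table}"
        for chain in PATHS[scenario]
        for table in TABLES_BY_CHAIN[chain]
    ]


for step in traversal("forwarded"):
    print(step)
```

Reading the output top to bottom gives a rough checklist of where a rule could affect a forwarded packet, which helps when deciding which table and chain a new rule belongs in.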
We can summarize the process of data packets passing through the firewall as follows:
Query rules
-t: Table name
-n: Do not resolve IP addresses to names
-v: Display counter information (number and size of packets)
-x: Display the exact values of the counters
--line-numbers: Display rule numbers (can be abbreviated as --line)
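As a rough illustration of how these options combine, the helper below assembles a rule-listing command line. It only builds the argument list and does not run iptables. The function name is hypothetical, and the -L (list) flag is not mentioned in the option list above; it is added here on the assumption that the options are meant to accompany a listing command.

```python
from typing import List, Optional


def list_rules_command(table: str = "filter", chain: Optional[str] = None) -> List[str]:
    """Build an argv that lists rules using the options described above."""
    argv = ["iptables", "-t", table, "-L"]  # -L (list) is an assumption, see lead-in
    if chain is not None:
        argv.append(chain)                  # restrict the listing to one chain
    argv += ["-n", "-v", "-x", "--line-numbers"]
    return argv


print(" ".join(list_rules_command("filter", "INPUT")))
```

Passing the result to a process runner would require root privileges on a real host, so the sketch stops at constructing the command.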
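When we visit a web page through an http URL, the client sends a request to port 80 of the server, and the server responds through port 80. So, as the client, it seems natural to allow port 80 so that the server's responses can enter the client host, and we open port 80 on the client. Similarly, when we connect to a server remotely with an ssh tool, the client sends a request to port 22 of the server, and the server responds through port 22, so we naturally open port 22 as well, so that the remote host's responses can pass through the firewall. But as the client, if we have not initiated a request to port 80 or to port 22, can we still receive data that other hosts send to us through port 80 or port 22? We can, because we have already opened ports 80 and 22 in order to receive http and ssh responses. So whether a message is a "response" to us or is "actively sent" to us, it can get through on these two ports. Think about it: isn't that rather unsafe? This is where the state extension module comes in.
From the state module's point of view, the messages in a "connection" can be in one of 5 states: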
NEW: The first packet of a connection has the state NEW.
ESTABLISHED: Packets that follow the NEW packet can be understood as having the state ESTABLISHED, indicating that the connection has been established.
RELATED: Literally, RELATED means "having a relationship", which on its own is not easy to understand, so here is an example. In the FTP service, the FTP server creates two processes: a command process and a data process. The command process handles command transmission between the server and the client (we can regard this transmission as a "connection" in the state module's sense, and call it the "command connection" for now). The data process handles data transmission between the server and the client (call it the "data connection" for now). Which data gets transmitted, however, is controlled by the commands, so the packets in the "data connection" are "related" to the "command connection". Packets in the "data connection" may therefore be in the RELATED state. (Note: To perform connection tracking for FTP, the corresponding kernel module nf_conntrack_ftp must be loaded separately. To load it automatically, configure the /etc/sysconfig/iptables-config file.)
INVALID: If a packet cannot be identified, or has no state, its state is INVALID. We can actively block packets in the INVALID state.
UNTRACKED: The packet has not been tracked. This state usually means that no related connection can be found for the packet.
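The problem in the example above can be solved with the state extension module. We only need to allow packets whose state is ESTABLISHED, because a packet in the ESTABLISHED state must be a reply to a packet sent earlier. This way only replies to us can pass through the firewall; new packets that others actively send to us cannot: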
iptables -t filter -I INPUT -m state --state ESTABLISHED -j ACCEPT
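To make the effect of this rule concrete, here is a toy Python model of the decision it represents. It is a sketch under stated assumptions, not Netfilter's logic: only NEW and ESTABLISHED are modelled, the class and method names are invented, the remote addresses are made up, and the DROP fallback stands for an assumed default policy (the rule above only accepts; something else has to reject the rest).

```python
from typing import Set, Tuple

# Assumed tuple layout: (src_ip, src_port, dst_ip, dst_port).
FlowTuple = Tuple[str, int, str, int]


class StatefulFilter:
    """Toy model: accept inbound packets only if they are ESTABLISHED."""

    def __init__(self) -> None:
        self.tracked: Set[FlowTuple] = set()

    def outbound(self, pkt: FlowTuple) -> None:
        # The first packet we send creates the connection record (state NEW).
        self.tracked.add(pkt)

    def state_of_inbound(self, pkt: FlowTuple) -> str:
        src_ip, src_port, dst_ip, dst_port = pkt
        # An inbound reply is the mirror image of something we sent earlier.
        reply_of = (dst_ip, dst_port, src_ip, src_port)
        return "ESTABLISHED" if reply_of in self.tracked else "NEW"

    def inbound_verdict(self, pkt: FlowTuple) -> str:
        # Mirrors "-m state --state ESTABLISHED -j ACCEPT"; the DROP fallback
        # is an assumed default policy, not something the rule itself sets.
        return "ACCEPT" if self.state_of_inbound(pkt) == "ESTABLISHED" else "DROP"


fw = StatefulFilter()
fw.outbound(("10.1.1.2", 40000, "10.2.2.2", 80))                # we request a web page
print(fw.inbound_verdict(("10.2.2.2", 80, "10.1.1.2", 40000)))  # the server's reply
print(fw.inbound_verdict(("10.3.3.3", 80, "10.1.1.2", 40000)))  # unsolicited sender
```

The point of the model is that the decision no longer depends on the port number at all: a packet from port 80 is accepted only when it answers a request this host actually made.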
When a default chain holds too many rules, it becomes inconvenient to manage. Imagine the INPUT chain contains 200 rules: some for the httpd service, some for the sshd service, some for private network IPs, and some for public network IPs. If we suddenly want to modify the rules related to the httpd service, do we have to read all 200 rules from the beginning to find out which ones apply to httpd? That is clearly unreasonable. iptables therefore lets us define custom chains, which solve this problem. Suppose we define a custom chain named IN_WEB and put all inbound rules for port 80 into it. When we later want to modify the inbound rules for the web service, we can modify the rules in the IN_WEB chain directly. Even if the default chain grows further, it does not matter, because we know that all inbound rules for port 80 are stored in the IN_WEB chain.
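The idea of grouping rules in a custom chain can be sketched as a small rule evaluator. This is a simplified model, not iptables itself: rules are (match, target) pairs, any target that is not ACCEPT, DROP, or REJECT is treated as a jump to a chain of that name, and when the custom chain reaches no verdict, evaluation resumes in the calling chain. The client addresses and the individual rules are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Packet = Dict[str, object]
Rule = Tuple[Callable[[Packet], bool], str]  # (match function, target)

TERMINAL = {"ACCEPT", "DROP", "REJECT"}


def evaluate(chains: Dict[str, List[Rule]], chain: str, pkt: Packet) -> Optional[str]:
    """Walk a chain; a non-terminal target is treated as a jump to that chain."""
    for match, target in chains[chain]:
        if not match(pkt):
            continue
        if target in TERMINAL:
            return target
        verdict = evaluate(chains, target, pkt)
        if verdict is not None:
            return verdict
        # The custom chain reached no verdict: continue in the calling chain.
    return None  # no rule decided; a real chain would apply its policy here


chains: Dict[str, List[Rule]] = {
    "INPUT": [
        (lambda p: p["dport"] == 80, "IN_WEB"),        # all port-80 traffic goes to IN_WEB
        (lambda p: p["dport"] == 22, "ACCEPT"),
    ],
    "IN_WEB": [
        (lambda p: p["src"] == "192.0.2.10", "DROP"),  # hypothetical blocked client
        (lambda p: True, "ACCEPT"),
    ],
}

print(evaluate(chains, "INPUT", {"src": "192.0.2.10", "dport": 80}))
print(evaluate(chains, "INPUT", {"src": "198.51.100.7", "dport": 80}))
```

With this layout, changing how web traffic is handled means editing only the IN_WEB list, however long the INPUT list becomes.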