Detailed explanation of SOCKET programming in Linux

1. How to communicate between processes in the network

The concept of inter-process communication (IPC) originally came from stand-alone systems. Each process runs within its own address space, so the operating system provides IPC facilities that let two communicating processes cooperate without interfering with each other, such as:

UNIX BSD: pipes, named pipes, and software interrupt signals.

UNIX System V: messages, shared memory, and semaphores.

All of these work only between processes on the same machine. Network process communication aims to solve the problem of communication between processes on different hosts (communication on the same machine can be regarded as a special case). The first problem to solve is how to identify a process across the network. On a single host, a process ID uniquely identifies a process. In a network environment, however, process IDs assigned independently by each host cannot do so. For example, host A may assign process ID 5 to some process while host B also has a process 5, so the phrase "process 5" alone is meaningless. Second, the operating system supports many network protocols, and different protocols work in different ways and use different address formats. Network process communication must therefore also solve the problem of identifying multiple protocols.

In fact, the TCP/IP protocol suite has already solved this problem for us. The IP address of the network layer uniquely identifies a host in the network, while the "protocol + port" of the transport layer uniquely identifies an application (process) on that host. The triplet (IP address, protocol, port) therefore identifies a network process, and processes in the network can use this identifier to communicate with each other.

Applications that use TCP/IP usually communicate through one of two application programming interfaces: the sockets of UNIX BSD or the TLI of UNIX System V (now obsolete). Today almost all applications use sockets. In the Internet era, process communication over the network is everywhere, which is why I say "everything is a socket".

2. What are TCP/IP and UDP

TCP/IP (Transmission Control Protocol/Internet Protocol) is an industry-standard protocol suite originally designed for wide area networks (WANs).

The TCP/IP protocol stack lives in the operating system, and network services are provided through the OS. The system calls that support TCP/IP, known as Berkeley sockets, were added to the OS: socket, connect, send, recv, and so on.

UDP (User Datagram Protocol) is the transport-layer counterpart of TCP and is also a member of the TCP/IP protocol suite, as shown in the figure:

[Figure: the TCP/IP protocol suite]

The TCP/IP protocol suite includes the transport layer, network layer, and link layer, and the position of the socket is shown in the figure. The socket is the intermediate software abstraction layer between the application layer and the TCP/IP protocol suite.

[Figure: position of the socket layer between the application layer and the TCP/IP protocol suite]

3. What is Socket

1. Socket:

Sockets originated in Unix. One of the basic philosophies of Unix/Linux is that "everything is a file", which can be operated on with the pattern open -> read/write -> close. A socket is one implementation of this pattern: it is a special kind of file, and the socket functions are operations on it (read/write I/O, open, close).

Put plainly, a socket is the intermediate software abstraction layer through which the application layer communicates with the TCP/IP protocol suite. It is a set of interfaces. In design-pattern terms, a socket is a facade: it hides the complex TCP/IP protocol suite behind the socket interface. For the user, a simple set of interfaces is all there is, and the socket organizes the data to comply with the specified protocol.

Note: Strictly speaking, a socket is not a layer in the protocol model. It is an application of the facade design pattern, a software abstraction that makes programming easier. Network programming relies heavily on sockets.

2. Socket descriptor

A socket descriptor is really just an integer. The three descriptors we know best are 0, 1, and 2: 0 is standard input, 1 is standard output, and 2 is standard error. These are the integer representations; the corresponding FILE * structures are stdin, stdout, and stderr.

The socket API was originally developed as part of the UNIX operating system, so it is integrated with the system's other I/O facilities. In particular, when an application creates a socket for Internet communication, the operating system returns a small integer as a descriptor that identifies the socket. The application then passes that descriptor as an argument to the functions that perform operations on the socket (such as sending data over the network or receiving incoming data).

In many operating systems, socket descriptors and other I/O descriptors are integrated, so an application can use the same read/write operations for socket I/O as for file I/O.
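As a minimal sketch of that integration, the helper below creates a connected pair of sockets and moves bytes through them with plain read() and write(), the same calls used for files. The name echo_through_socketpair is hypothetical and chosen only for illustration; the sketch assumes a POSIX system and uses socketpair(), which the article itself does not discuss.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends msg through a connected socket pair using
 * the ordinary file I/O calls write() and read().
 * Returns the number of bytes read into out, or -1 on error. */
ssize_t echo_through_socketpair(const char *msg, char *out, size_t outsz)
{
    int sv[2];
    ssize_t n;

    /* An empty message would leave read() waiting forever. */
    if (outsz == 0 || msg[0] == '\0')
        return -1;
    /* AF_UNIX stream sockets: both ends live on the local machine. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* A socket descriptor is accepted wherever a file descriptor is. */
    if (write(sv[0], msg, strlen(msg)) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = read(sv[1], out, outsz - 1);
    if (n >= 0)
        out[n] = '\0';

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling echo_through_socketpair("hello", buf, sizeof buf) should leave the same five bytes in buf, without any socket-specific send or receive call being involved.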

When an application creates a socket, the operating system returns a small integer as a descriptor, and the application uses this descriptor to refer to the socket. Files work the same way: an application that needs file I/O asks the operating system to open a file, and the operating system creates a file descriptor through which the application accesses the file. From the application's point of view, a file descriptor is an integer that it can use to read and write the file. The figure below shows how the operating system implements file descriptors as an array of pointers to internal data structures.

[Figure: the file descriptor table as an array of pointers to internal data structures]

The system maintains a separate file descriptor table for each running process. When a process opens a file, the system writes a pointer to the file's internal data structure into the table and returns the index of that entry to the caller. The application only needs to remember this descriptor and use it whenever it operates on the file. The operating system uses the descriptor as an index into the process's descriptor table and follows the pointer to the data structure that holds all the information about the file.

System data structure for sockets:

1) The socket API includes a function, socket, that creates a socket. The general design idea is that a single system call can create any kind of socket, because sockets are quite general. Once the socket exists, the application calls other functions to specify the details. For example, calling socket creates a new descriptor entry:

[Figure: the descriptor table entry created by socket]

2) Although the internal data structure of a socket contains many fields, most of them are left unfilled when the system creates the socket. After creating the socket, the application must call other functions to populate these fields before the socket can be used.

3. The difference between file descriptors and file pointers:

File descriptor: When you open a file on Linux you get a file descriptor, which is a small non-negative integer. Each process keeps a file descriptor table in its PCB (Process Control Block); the file descriptor is an index into this table, and each table entry holds a pointer to an open file.

File pointer: In C, a file pointer serves as the handle for I/O. It points to a data structure in the process's user space called the FILE structure, which contains a buffer and a file descriptor. Since the file descriptor is itself an index into the file descriptor table, the file pointer is in a sense a handle to a handle (on Windows, the file descriptor is called a file handle).
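The relationship between the two can be sketched with the standard POSIX calls fdopen() and fileno(). The helper name file_pointer_roundtrip is hypothetical; it wraps the write end of a pipe in a FILE structure, writes through the buffered file pointer, and reads the bytes back through the raw descriptor at the other end.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: writes text through a FILE* wrapped around the
 * write end of a pipe, then reads it back through the raw descriptor.
 * Returns the number of bytes read, or -1 on error. */
ssize_t file_pointer_roundtrip(const char *text, char *out, size_t outsz)
{
    int fds[2];
    FILE *fp;
    ssize_t n;

    if (outsz == 0 || pipe(fds) == -1)
        return -1;

    /* fdopen() builds a FILE structure (buffer + descriptor) around fds[1]. */
    fp = fdopen(fds[1], "w");
    if (fp == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* fileno() recovers the descriptor stored inside the FILE structure. */
    if (fileno(fp) != fds[1] || fputs(text, fp) == EOF) {
        fclose(fp);
        close(fds[0]);
        return -1;
    }
    /* fclose() flushes the user-space buffer and closes fds[1] as well. */
    fclose(fp);

    n = read(fds[0], out, outsz - 1);
    if (n >= 0)
        out[n] = '\0';
    close(fds[0]);
    return n;
}
```

The same fileno() call applied to stdin, stdout, and stderr yields the descriptors 0, 1, and 2.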

4. Basic SOCKET interface function

In everyday life, when A wants to call B, A dials the number, B hears the phone ring and picks up, and the connection between A and B is established. A and B can then talk, and when they are done they hang up to end the call. A phone call illustrates the "open, write/read, close" pattern in a simple way.

[Figure: call sequence of the socket functions on the server and the client]

The server first initializes a socket, binds it to a port, listens on the port, and calls accept, which blocks until a client connects. When a client initializes a socket and connects to the server (connect) and the connection succeeds, the connection between client and server is established. The client sends a request; the server receives and processes it and sends the response back to the client; the client reads the response; finally the connection is closed and the interaction ends.

These interfaces are implemented by the kernel. For the implementation details, see the Linux kernel source.

4.1, socket() function

int socket(int protofamily, int type, int protocol);//Return sockfd

sockfd is the descriptor.

The socket function corresponds to the open operation on an ordinary file. Opening a file returns a file descriptor; socket() creates a socket descriptor, which uniquely identifies a socket. Like a file descriptor, the socket descriptor is passed as an argument to subsequent operations, such as reads and writes.

Just as you can pass different arguments to fopen to open different files, you can pass different arguments to socket to create different kinds of socket descriptors. The three parameters of the socket function are:

protofamily: the protocol domain, also called the protocol family. Commonly used families include AF_INET (IPv4), AF_INET6 (IPv6), AF_LOCAL (also called AF_UNIX, the Unix domain socket), and AF_ROUTE. The protocol family determines the address type of the socket, and the matching address type must be used in communication. For example, AF_INET uses the combination of an IPv4 address (32 bits) and a port number (16 bits), and AF_UNIX uses an absolute pathname as the address.

type: the socket type. Commonly used types include SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_PACKET, and SOCK_SEQPACKET.

protocol: as the name suggests, the specific protocol. Commonly used protocols include IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP, and IPPROTO_TIPC, which correspond to the TCP, UDP, SCTP, and TIPC transport protocols respectively (TIPC will be discussed separately).

Note: type and protocol cannot be combined arbitrarily. For example, SOCK_STREAM cannot be combined with IPPROTO_UDP. When protocol is 0, the default protocol for the given type is selected automatically.

When we call socket to create a socket, the returned descriptor exists in a protocol family (address family, AF_XXX) space but has no specific address. To assign it an address, call bind(); otherwise the system assigns a port automatically when connect() or listen() is called.
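The rule about combining type and protocol can be checked with a small sketch. The helper name try_inet_socket is hypothetical; it simply calls socket() in the AF_INET family and reports whether the kernel accepted the combination.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: tries to create an AF_INET socket with the given
 * type and protocol. Returns 0 on success, or the errno value reported
 * by socket() on failure. */
int try_inet_socket(int type, int protocol)
{
    int fd = socket(AF_INET, type, protocol);
    if (fd == -1)
        return errno;
    close(fd);      /* the descriptor was only needed for the check */
    return 0;
}
```

With this helper, try_inet_socket(SOCK_STREAM, 0) and try_inet_socket(SOCK_STREAM, IPPROTO_TCP) should both succeed, while try_inet_socket(SOCK_STREAM, IPPROTO_UDP) should return a non-zero error code (on Linux this is typically EPROTONOSUPPORT).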

4.2. bind() function

As mentioned above, bind() assigns a specific address from an address family to the socket. For AF_INET and AF_INET6, for example, it assigns the combination of an IPv4 or IPv6 address and a port number.

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

The three parameters of the function are:

sockfd: the socket descriptor, created by socket(), which uniquely identifies a socket. bind() binds a name to this descriptor.

addr: a const struct sockaddr * pointer to the protocol address to be bound to sockfd. The address structure depends on the address family used when the socket was created. For IPv4 it is:

struct sockaddr_in {
    sa_family_t    sin_family; /* address family: AF_INET */
    in_port_t      sin_port;   /* port in network byte order */
    struct in_addr sin_addr;   /* internet address */
};

/* Internet address. */
struct in_addr {
    uint32_t       s_addr;     /* address in network byte order */
};

For IPv6 it is:

struct sockaddr_in6 {
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* port number */
    uint32_t        sin6_flowinfo; /* IPv6 flow information */
    struct in6_addr sin6_addr;     /* IPv6 address */
    uint32_t        sin6_scope_id; /* Scope ID (new in 2.4) */
};

struct in6_addr {
    unsigned char   s6_addr[16];   /* IPv6 address */
};

For the Unix domain it is:

#define UNIX_PATH_MAX 108

struct sockaddr_un {
    sa_family_t sun_family;               /* AF_UNIX */
    char        sun_path[UNIX_PATH_MAX];  /* pathname */
};

addrlen: the length of the address structure.

A server usually binds to a well-known address (an IP address plus port number) when it starts, so that clients can connect to it. A client does not need to specify an address: the system assigns a port number automatically and combines it with the host's own IP address. This is why a server usually calls bind() before listen(), while a client does not call it and lets the system pick a port during connect().
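A minimal sketch of filling in a sockaddr_in and binding it, assuming IPv4 and TCP. The helper names bind_tcp and local_port are hypothetical. Passing port 0 asks the kernel to choose a free port, which is the same automatic assignment a client gets during connect(); getsockname() then reports which port was chosen.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket and binds it to the given
 * IPv4 address and port, both supplied in host byte order.
 * Returns the socket descriptor, or -1 on error. */
int bind_tcp(uint32_t host_addr, uint16_t host_port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(host_addr);  /* to network byte order */
    addr.sin_port = htons(host_port);         /* to network byte order */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns the local port bound to fd in host byte order, or 0 on error. */
uint16_t local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return 0;
    return ntohs(addr.sin_port);
}
```

A server would call bind_tcp(INADDR_ANY, 8000) to claim a well-known port, whereas bind_tcp(INADDR_ANY, 0) followed by local_port() shows the port the system picked on its own.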

Network byte order and host byte order

Host byte order is what we usually call big-endian or little-endian: different CPUs store the bytes of an integer in memory in different orders, and the order a given machine uses is its host byte order. The standard definitions of big-endian and little-endian are:

a) Little-endian: the low-order byte is stored at the low memory address and the high-order byte at the high memory address.

b) Big-endian: the high-order byte is stored at the low memory address and the low-order byte at the high memory address.

Network byte order: a 4-byte, 32-bit value is transmitted most significant byte first (in the bit numbering used for TCP/IP headers: first bits 0~7, then 8~15, then 16~23, and finally 24~31). This order is big-endian. Because all binary integers in TCP/IP headers must be transmitted in this order, it is also called network byte order. Byte order, as the name suggests, is the order in which data larger than one byte is stored in memory; single-byte data has no ordering issue.

So, when binding an address to a socket, first convert the values from host byte order to network byte order, and never assume that the host uses big-endian like the network. This mistake has caused real damage: it introduced many baffling bugs in our company's project code. Make no assumptions about host byte order, and always convert to network byte order before assigning a value to the socket address.
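A minimal sketch of the conversion, assuming a POSIX system. htons(), htonl(), ntohs(), and ntohl() are the standard conversion calls; the helper names host_is_little_endian and port_to_wire are hypothetical and exist only to make the byte layout visible.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the low-order byte at the lowest address. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x0001;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 0x01;
}

/* Copies the in-memory bytes of a port number, after conversion to
 * network byte order, into out[0] and out[1]. */
void port_to_wire(uint16_t host_port, unsigned char out[2])
{
    uint16_t net = htons(host_port);
    memcpy(out, &net, sizeof(net));
}
```

For port 8000 (0x1F40), port_to_wire() stores 0x1F in out[0] and 0x40 in out[1] regardless of what host_is_little_endian() reports, which is exactly why the conversion functions should always be used instead of assumptions about the host.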

4.3, listen(), connect() function

On the server side, after calling socket() and bind(), call listen() to listen on the socket. If a client then calls connect(), it issues a connection request, which the server receives.

int listen(int sockfd, int backlog);
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

The first parameter of listen is the socket descriptor to listen on, and the second is the maximum number of connections that can be queued for that socket. A socket created by socket() is active by default; listen turns it into a passive socket that waits for client connection requests.

The first parameter of the connect function is the client's socket descriptor, the second parameter is the server's socket address, and the third parameter is the length of the socket address. The client establishes a connection with the TCP server by calling the connect function.

4.4, accept() function

After the TCP server calls socket(), bind(), and listen() in sequence, it listens on the specified socket address. After the TCP client calls socket() and connect() in sequence, it sends a connection request to the TCP server. When the request arrives, the server calls accept() to take the connection, which then becomes available to the application. Network I/O can begin at this point; it is similar to ordinary file read/write I/O.

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen); //Return connection connect_fd

Parameter sockfd

sockfd is the listening socket described above. It listens on a port; when a client connects to the server it uses that port number, which is associated with this socket. The client, of course, knows nothing about the socket itself; it only knows an address and a port number.

Parameter addr

This is a result parameter that receives a return value: the address of the client. The address is described by an address structure, and the caller should know which kind of structure it is. If you are not interested in the client's address, set this parameter to NULL.

Parameter addrlen

As you might expect, this is also a result parameter. On entry it must hold the size of the buffer that addr points to; on return it holds the number of bytes actually occupied by the address. It can be set to NULL together with addr.

If accept returns successfully, the server and client have correctly established a connection. At this time, the server completes communication with the client through the socket returned by accept.

Note:

By default, accept blocks the process until a client connection is established. It returns a new socket, the connected socket.

At this point we need to distinguish two kinds of sockets:

Listening socket: the sockfd argument of accept. The server creates it by calling socket() and then turns it into a listening socket by calling listen(); its descriptor is called the listening socket descriptor.

Connected socket: listen() turns an actively connecting socket into a listening socket, whereas accept returns a connected socket descriptor, which represents a point-to-point connection that already exists on the network.

A server usually creates only one listening socket descriptor, which exists for the whole lifetime of the server. The kernel creates a connected socket descriptor for each client connection that the server process accepts, and the server closes that descriptor when it has finished serving the client.

A natural question is: why two kinds of sockets? The reason is simple. A single descriptor would have to serve too many purposes, which would make it very unintuitive to use. Besides, the kernel really does create a new descriptor for each connection.

The connected socket socketfd_new does not occupy a new port to communicate with the client; it uses the same port number as the listening socket socketfd.
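The claim about the shared port can be checked with a short single-process sketch on the loopback interface. The function names port_of and accepted_socket_shares_port are hypothetical. The sketch assumes that the loopback interface is available; it works in one process because the kernel completes the connection while it sits in the listen queue.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the local port of fd in host byte order, or 0 on error. */
static unsigned short port_of(int fd)
{
    struct sockaddr_in a;
    socklen_t len = sizeof(a);
    if (getsockname(fd, (struct sockaddr *)&a, &len) == -1)
        return 0;
    return ntohs(a.sin_port);
}

/* Hypothetical demo: returns 1 if the connected socket returned by
 * accept() uses the same local port as the listening socket,
 * 0 if it does not, and -1 on a setup error. */
int accepted_socket_shares_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd, afd, result = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);   /* will become the listener */
    cfd = socket(AF_INET, SOCK_STREAM, 0);   /* plays the client */
    if (lfd == -1 || cfd == -1)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                /* let the kernel pick a port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(lfd, 1) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) == -1)
        goto out;

    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;
    afd = accept(lfd, NULL, NULL);           /* new connected socket */
    if (afd == -1)
        goto out;

    result = (port_of(afd) == port_of(lfd));
    close(afd);
out:
    if (lfd != -1)
        close(lfd);
    if (cfd != -1)
        close(cfd);
    return result;
}
```

The two descriptors differ, but both report the same local port; the kernel tells connections apart by the full combination of local and remote address and port.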

4.5, read(), write() and other functions

Everything is now in place: the connection between the server and the client has been established, and we can call the network I/O functions to read and write, which is exactly communication between processes across the network. The network I/O functions come in the following pairs:

read()/write()

recv()/send()

readv()/writev()

recvmsg()/sendmsg()

recvfrom()/sendto()

I recommend recvmsg()/sendmsg(). They are the most general I/O functions; in fact, they can replace all of the other functions above. The declarations are as follows:

#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);

#include <sys/types.h>
#include <sys/socket.h>

ssize_t send(int sockfd, const void *buf, size_t len, int flags);
ssize_t recv(int sockfd, void *buf, size_t len, int flags);

ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
               const struct sockaddr *dest_addr, socklen_t addrlen);
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                 struct sockaddr *src_addr, socklen_t *addrlen);

ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);

The read function reads from fd. On success it returns the number of bytes actually read. A return value of 0 means end of file has been reached; a value less than 0 means an error occurred. If the error is EINTR, the read was interrupted by a signal; if it is ECONNRESET, there is a problem with the network connection.

The write function writes count bytes from buf to the file descriptor fd. On success it returns the number of bytes written; on failure it returns -1 and sets errno. In network programs there are two possibilities when writing to a socket descriptor: 1) the return value is greater than 0, meaning that some or all of the data was written; 2) the return value is less than 0, meaning an error occurred, which must be handled according to its type. If the error is EINTR, the write was interrupted by a signal; if it is EPIPE, there is a problem with the network connection (the peer has closed the connection).
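Because write may transfer only part of the data and either call may be interrupted, robust code usually wraps them in loops. The following is a minimal sketch of that idea; the names write_all and read_all are hypothetical, and the error policy (retry on EINTR, give up on anything else) is one reasonable choice rather than the only one.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Writes exactly n bytes from buf to fd, retrying on EINTR and on
 * short writes. Returns n on success, -1 on error (errno is set). */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* e.g. EPIPE: the peer closed the connection */
        }
        p += w;                 /* skip the part that was written */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}

/* Reads until n bytes have arrived or end-of-file is reached.
 * Returns the number of bytes read (less than n only at end-of-file),
 * or -1 on error (errno is set). */
ssize_t read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* e.g. ECONNRESET */
        }
        if (r == 0)
            break;              /* end-of-file: the peer closed its side */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}
```

Both helpers work on any descriptor, so they can be tried on a pipe as easily as on a connected socket.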

I will not go through these pairs of I/O functions one by one; see the man pages or search Baidu or Google for details. The example below uses send/recv.

4.6. close() function

After the server and the client have established a connection and finished their reads and writes, the socket descriptor must be closed, just as an opened file is closed with fclose once you are done with it.

#include <unistd.h>
int close(int fd);

By default, close marks the socket as closed and returns to the calling process immediately. The descriptor can no longer be used by the calling process; that is, it can no longer be passed as the first argument of read or write.

Note: close only decrements the reference count of the socket by 1. Only when the reference count reaches 0 does TCP send a termination request (FIN) to the peer.
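The reference-count behaviour can be observed without a network. The sketch below is an illustration under stated assumptions, not the article's own code: it uses a local AF_UNIX socket pair instead of TCP, creates a second reference with dup(), and checks when the peer sees end-of-file. The function name eof_only_after_last_close is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the peer sees end-of-file only after
 * the last descriptor referring to the socket has been closed,
 * 0 if the behaviour differs, and -1 on a setup error. */
int eof_only_after_last_close(void)
{
    int sv[2], copy;
    char c;
    ssize_t before, after;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    copy = dup(sv[0]);          /* second reference to the same socket */
    if (copy == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[0]);               /* reference count drops from 2 to 1 */
    /* Non-blocking receive: no data and no end-of-file yet, so -1. */
    before = recv(sv[1], &c, 1, MSG_DONTWAIT);

    close(copy);                /* count reaches 0: the socket is torn down */
    /* Now the peer reads end-of-file, reported as a return value of 0. */
    after = recv(sv[1], &c, 1, MSG_DONTWAIT);

    close(sv[1]);
    return before == -1 && after == 0;
}
```

The same mechanism explains why a forking server must close the connected descriptor in both the parent and the child before the connection is actually terminated.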

5. TCP connection establishment in sockets (three-way handshake)

TCP establishes a connection by exchanging three segments. This process is called the three-way handshake, and it works as follows.

First handshake: to establish a connection, the client sends a SYN packet (syn=j) to the server and enters the SYN_SENT state, waiting for the server's acknowledgment. SYN stands for Synchronize Sequence Numbers.

Second handshake: the server receives the SYN packet and must acknowledge the client's SYN (ack=j+1). At the same time it sends its own SYN (syn=k), so the reply is a SYN+ACK packet. The server then enters the SYN_RECV state.

Third handshake: the client receives the server's SYN+ACK and sends an acknowledgment ACK (ack=k+1). Once this packet has been sent, the client and the server enter the ESTABLISHED state and the three-way handshake is complete.

A complete three-way handshake is: request, response, confirmation.

Corresponding function interface:

[Figure: the socket functions involved in the three-way handshake]

As the figure shows, when the client calls connect, a connection request is triggered: a SYN J packet is sent to the server and connect blocks. The server's TCP, which is listening, receives SYN J and replies with SYN K, ACK J+1, while accept remains blocked. When the client receives SYN K, ACK J+1, connect returns and the client acknowledges SYN K. When the server receives ACK K+1, accept returns. At this point the three-way handshake is complete and the connection is established.

We can observe the process with a packet capture.

Suppose our server listens on port 9502. Use tcpdump to capture the packets:

tcpdump -i any tcp port 9502

Then open a connection with telnet:

telnet 127.0.0.1 9502

14:12:45.104687 IP localhost.39870 > localhost.9502: Flags [S], seq 2927179378, win 32792, options [mss 16396,sackOK,TS val 255474104 ecr 0,nop,wscale 3], length 0 (1)
14:12:45.104701 IP localhost.9502 > localhost.39870: Flags [S.], seq 1721825043, ack 2927179379, win 32768, options [mss 16396,sackOK,TS val 255474104 ecr 255474104,nop,wscale 3], length 0 (2)
14:12:45.104711 IP localhost.39870 > localhost.9502: Flags [.], ack 1, win 4099, options [nop,nop,TS val 255474104 ecr 255474104], length 0 (3)

14:13:01.415407 IP localhost.39870 > localhost.9502: Flags [P.], seq 1:8, ack 1, win 4099, options [nop,nop,TS val 255478182 ecr 255474104], length 7
14:13:01.415432 IP localhost.9502 > localhost.39870: Flags [.], ack 8, win 4096, options [nop,nop,TS val 255478182 ecr 255478182], length 0
14:13:01.415747 IP localhost.9502 > localhost.39870: Flags [P.], seq 1:19, ack 8, win 4096, options [nop,nop,TS val 255478182 ecr 255478182], length 18
14:13:01.415757 IP localhost.39870 > localhost.9502: Flags [.], ack 19, win 4097, options [nop,nop,TS val 255478182 ecr 255478182], length 0

14:12:45.104687 is the timestamp, accurate to the microsecond.

localhost.39870 > localhost.9502 shows the direction of the traffic: 39870 is the client port and 9502 is the server port.

[S] means this is a SYN request.

[S.] means this is a SYN+ACK packet.

[.] means this is an ACK packet. (client) SYN -> (server) SYN+ACK -> (client) ACK is the three-way handshake.

[P] means this is a data push, which can go from the server to the client or from the client to the server.

[F] means this is a FIN packet, a connection-closing operation that either the client or the server may initiate.

[R] means this is an RST packet. It also ends the connection, like a FIN, but RST indicates that the connection was closed while data was still unprocessed; it can be understood as forcibly cutting off the connection.

win 4099 is the sliding window size.

length 18 is the size of the data in the packet.

Steps (1), (2), and (3) are the establishment of the TCP connection:

First handshake:

14:12:45.104687 IP localhost.39870 > localhost.9502: Flags [S], seq 2927179378

The client, localhost.39870 (the client port is usually assigned automatically), sends a SYN packet (syn=j) to the server, localhost.9502.

SYN packet (syn=j): seq 2927179378 (j=2927179378)

Second handshake:

14:12:45.104701 IP localhost.9502 > localhost.39870: Flags [S.], seq 1721825043, ack 2927179379,

Receive the request and acknowledge it: the server receives the SYN packet and must acknowledge the client's SYN (ack=j+1). At the same time it sends its own SYN (syn=k), so the reply is a SYN+ACK packet:
The server's own SYN: seq 1721825043 (k=1721825043).
The ACK is j+1: ack 2927179379.

Third handshake:

14:12:45.104711 IP localhost.39870 > localhost.9502: Flags [.], ack 1,

The client receives the server's SYN+ACK and sends an acknowledgment ACK (ack=k+1) to the server. After the SYN packets, tcpdump prints sequence and acknowledgment numbers relative to the initial ones by default, so ack=k+1 appears as ack 1.

Once the client and the server are in the ESTABLISHED state they can exchange data. This step has nothing to do with the accept call: the three-way handshake completes even if accept has not been called.
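That the handshake is independent of accept can be checked in a single process. In the sketch below (the function name connect_before_accept is hypothetical, and the loopback interface is assumed to be available), connect() succeeds and data is sent before accept() is ever called; the kernel completes the handshake and keeps the connection in the listen queue.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if connect() and send() succeed before
 * accept() is called and the data is readable afterwards,
 * 0 if not, and -1 on a setup error. */
int connect_before_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[4];
    int lfd, cfd, afd, ok = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1 || cfd == -1)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);               /* kernel-chosen port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(lfd, 5) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) == -1)
        goto out;

    /* accept() has not been called, yet the handshake completes here. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;
    /* The connection is ESTABLISHED, so data can already be sent. */
    if (send(cfd, "ping", 4, 0) != 4)
        goto out;

    /* Only now does the server take the connection from the queue. */
    afd = accept(lfd, NULL, NULL);
    if (afd == -1)
        goto out;
    ok = (recv(afd, buf, sizeof(buf), MSG_WAITALL) == 4 &&
          memcmp(buf, "ping", 4) == 0);
    close(afd);
out:
    if (lfd != -1)
        close(lfd);
    if (cfd != -1)
        close(cfd);
    return ok;
}
```

In other words, accept() does not perform the handshake; it only hands an already established connection to the application.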

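When a connection cannot be established, the usual causes are a network problem, an overloaded network card, or the number of connections having reached its limit.

The remaining part of the capture is the data transfer:

IP localhost.39870 > localhost.9502: Flags [P.], seq 1:8, ack 1, win 4099, options [nop,nop,TS val 255478182 ecr 255474104], length 7

The client sends 7 bytes of data to the server.

IP localhost.9502 > localhost.39870: Flags [.], ack 8, win 4096, options [nop,nop,TS val 255478182 ecr 255478182], length 0

The server acknowledges to the client that the data has been received.

IP localhost.9502 > localhost.39870: Flags [P.], seq 1:19, ack 8, win 4096, options [nop,nop,TS val 255478182 ecr 255478182], length 18

Then the server writes data to the client.

IP localhost.39870 > localhost.9502: Flags [.], ack 19, win 4097, options [nop,nop,TS val 255478182 ecr 255478182], length 0

The client acknowledges to the server that the data has been received.

This is TCP's reliable connection: every transmission must be acknowledged by the other side.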

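6. TCP connection termination (four-way handshake)

Establishing a connection takes a three-way handshake, while terminating one takes a four-way handshake. This is caused by TCP's half-close, as shown in the figure:

[Figure: the four segments exchanged when a TCP connection is terminated]

Because a TCP connection is full-duplex, each direction must be closed separately. The rule is that when one side has finished sending its data, it sends a FIN to terminate the connection in that direction. Receiving a FIN only means that no more data will flow in that direction; a TCP connection can still send data after receiving a FIN. The side that closes first performs the active close, and the other side performs the passive close.

(1) Client A sends a FIN to close the data transfer from client A to server B (segment 4).

(2) Server B receives the FIN and sends back an ACK whose acknowledgment number is the received sequence number plus 1 (segment 5). Like a SYN, a FIN consumes one sequence number.

(3) Server B closes its connection to client A and sends a FIN to client A (segment 6).

(4) Client A sends back an ACK and sets the acknowledgment number to the received sequence number plus 1 (segment 7).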
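The half-close described above can be sketched with shutdown(), a standard call that the article itself does not use. The example is an illustration under stated assumptions: it runs on a local AF_UNIX socket pair rather than TCP, so no FIN segment is actually sent, but the visible behaviour is the same. One side stops sending, the other side reads end-of-file, and data can still flow in the opposite direction. The function name half_close_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if, after side 0 half-closes, side 1
 * reads end-of-file and can still send data back to side 0;
 * returns 0 if not, and -1 on a setup error. */
int half_close_demo(void)
{
    int sv[2];
    char buf[8];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Side 0 is done sending. On a TCP socket this is what emits the FIN. */
    if (shutdown(sv[0], SHUT_WR) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Side 1 reads end-of-file in that direction ... */
    ok = (recv(sv[1], buf, sizeof(buf), 0) == 0);
    /* ... but may keep sending in the other direction. */
    ok = ok && (send(sv[1], "bye", 3, 0) == 3);
    ok = ok && (recv(sv[0], buf, sizeof(buf), 0) == 3);
    ok = ok && (memcmp(buf, "bye", 3) == 0);

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Unlike close(), shutdown(fd, SHUT_WR) ends only the sending direction and leaves the descriptor open for reading.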

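The corresponding function interface is shown in the figure:

[Figure: the socket functions involved in the four-way termination]

The process is as follows:

An application process first calls close to actively close the connection, and its TCP sends a FIN M.

The other end receives FIN M, performs the passive close, and acknowledges the FIN. The receipt of the FIN is also passed to the application process as an end-of-file, because receiving a FIN means the application process will receive no further data on the connection.

Some time later, the application process that received the end-of-file calls close to close its socket, which causes its TCP to send a FIN N as well.

The TCP of the original sender, on receiving this FIN, acknowledges it.

So there is one FIN and one ACK in each direction.

1. Why is connection establishment a three-way handshake while connection termination is a four-way handshake?

Because when a server socket in the LISTEN state receives a SYN connection request, it can send the ACK and the SYN in a single segment (the ACK acknowledges, the SYN synchronizes). When closing a connection, however, receiving the peer's FIN only means that the peer has no more data to send to you; you may not yet have sent all of your own data to the peer. So you do not necessarily close the socket right away: you may need to send some more data first and only then send your FIN to indicate that you agree to close the connection. That is why the ACK and the FIN are in most cases sent separately here.

2. Why does the TIME_WAIT state have to wait 2MSL before returning to the CLOSED state?

Although both sides have agreed to close the connection and all four handshake segments have been sent, so that in principle the socket could go straight back to CLOSED (just as it goes from SYN_SENT to ESTABLISHED), we must assume that the network is unreliable. You cannot guarantee that the last ACK you sent will be received by the peer, so the peer's socket in the LAST_ACK state may time out waiting for the ACK and retransmit its FIN. The purpose of the TIME_WAIT state is therefore to be able to resend an ACK that may have been lost.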

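7. Socket programming example

Server: listens on local port 8000. When a connection request arrives, it accepts the connection, receives the message sent by the client, and returns a message to the client.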

/* File Name: server.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define DEFAULT_PORT 8000
#define MAXLINE 4096

int main(int argc, char** argv)
{
    int    socket_fd, connect_fd;
    struct sockaddr_in     servaddr;
    char    buff[MAXLINE];
    ssize_t n;
    const char *reply = "Hello,you are connected!\n";

    /* Create the socket */
    if( (socket_fd = socket(AF_INET, SOCK_STREAM, 0)) == -1 ){
        printf("create socket error: %s(errno: %d)\n",strerror(errno),errno);
        exit(1);
    }
    /* Initialize the server address */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY); /* INADDR_ANY: accept connections on any local address */
    servaddr.sin_port = htons(DEFAULT_PORT);      /* the port is DEFAULT_PORT */

    /* Bind the local address to the socket */
    if( bind(socket_fd, (struct sockaddr*)&servaddr, sizeof(servaddr)) == -1){
        printf("bind socket error: %s(errno: %d)\n",strerror(errno),errno);
        exit(1);
    }
    /* Start listening for client connections */
    if( listen(socket_fd, 10) == -1){
        printf("listen socket error: %s(errno: %d)\n",strerror(errno),errno);
        exit(1);
    }
    /* Let the kernel reap finished children so they do not become zombies */
    signal(SIGCHLD, SIG_IGN);
    printf("======waiting for client's request======\n");
    while(1){
        /* Block until a client connects, so no CPU time is wasted polling */
        if( (connect_fd = accept(socket_fd, (struct sockaddr*)NULL, NULL)) == -1){
            printf("accept socket error: %s(errno: %d)\n",strerror(errno),errno);
            continue;
        }
        /* Receive the data sent by the client; leave room for the terminator */
        n = recv(connect_fd, buff, MAXLINE - 1, 0);
        if( n == -1 ){
            printf("recv error: %s(errno: %d)\n",strerror(errno),errno);
            close(connect_fd);
            continue;
        }
        /* Send the reply to the client from a child process */
        if(!fork()){ /* child process */
            if(send(connect_fd, reply, strlen(reply), 0) == -1)
                perror("send error");
            close(connect_fd);
            _exit(0); /* do not flush the stdio buffers inherited from the parent */
        }
        buff[n] = '\0';
        printf("recv msg from client: %s\n", buff);
        close(connect_fd);
    }
    close(socket_fd);
}


Client:

/* File Name: client.c */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAXLINE 4096

int main(int argc, char** argv)
{
    int    sockfd;
    ssize_t rec_len;
    char    sendline[MAXLINE];
    char    buf[MAXLINE];
    struct sockaddr_in    servaddr;

    if( argc != 2){
        printf("usage: ./client <ipaddress>\n");
        exit(1);
    }

    /* Create the socket */
    if( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0){
        printf("create socket error: %s(errno: %d)\n", strerror(errno),errno);
        exit(1);
    }

    /* Fill in the server address: port 8000 and the IP given on the command line */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(8000);
    if( inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0){
        printf("inet_pton error for %s\n",argv[1]);
        exit(1);
    }

    /* Connect to the server */
    if( connect(sockfd, (struct sockaddr*)&servaddr, sizeof(servaddr)) < 0){
        printf("connect error: %s(errno: %d)\n",strerror(errno),errno);
        exit(1);
    }

    /* Read one line from standard input and send it */
    printf("send msg to server: \n");
    if( fgets(sendline, MAXLINE, stdin) == NULL ){
        printf("no input to send\n");
        exit(1);
    }
    if( send(sockfd, sendline, strlen(sendline), 0) < 0)
    {
        printf("send msg error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }
    /* Receive the reply; leave room for the terminating '\0' */
    if((rec_len = recv(sockfd, buf, MAXLINE - 1, 0)) == -1) {
        perror("recv error");
        exit(1);
    }
    buf[rec_len]  = '\0';
    printf("Received : %s ",buf);
    close(sockfd);
    exit(0);
}


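inet_pton is the IP address conversion function on Linux. It converts an IP address from its text form ("dotted decimal" for IPv4) to its binary form, and is an extension of inet_addr.

int inet_pton(int af, const char *src, void *dst); // convert a string to a network address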


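The first parameter, af, is the address family; the converted address is stored in dst.
af = AF_INET: src points to a character string holding an ASCII address in ddd.ddd.ddd.ddd format; the function converts it to an in_addr structure and copies it to *dst.
af = AF_INET6: src points to an IPv6 address string; the function converts it to an in6_addr structure and copies it to *dst.
inet_pton returns 1 on success. If src is not a valid address string for the family given by af, it returns 0. If af is not a supported address family, it returns -1 and sets errno to EAFNOSUPPORT.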
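The return values of inet_pton can be exercised with a short sketch. The helper name ipv4_to_host_u32 is hypothetical; it passes the result of inet_pton through unchanged and, on success, also converts the address to a host-order integer so that it is easy to inspect.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: parses a dotted-decimal IPv4 string.
 * Returns the inet_pton() result: 1 if text is valid (and *out then
 * holds the address in host byte order), 0 if text is not a valid
 * IPv4 address. */
int ipv4_to_host_u32(const char *text, uint32_t *out)
{
    struct in_addr addr;
    int rc = inet_pton(AF_INET, text, &addr);
    if (rc == 1)
        *out = ntohl(addr.s_addr);   /* s_addr is in network byte order */
    return rc;
}
```

For "127.0.0.1" the helper returns 1 and stores 0x7F000001, while a string such as "256.1.1.1" makes it return 0. Calling inet_pton directly with an address family it does not support, for example -1, returns -1.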

Test:

Compile server.c:

gcc -o server server.c

Start the process:

./server

Output:

======waiting for client's request======

The server then waits for a client to connect.

Compile client.c:

gcc -o client client.c

Connect the client to the server:

./client 127.0.0.1

The client waits for a message to be entered.

[Screenshot: the client waiting for input]

Send a message by typing: c++

[Screenshot: the client sending the message]

The server side now shows:

[Screenshot: server output]

The client receives the message:

[Screenshot: client output]

You can also test with telnet instead of the client:

telnet 127.0.0.1 8000

[Screenshot: telnet session]

Note:

When compiling the source on Ubuntu, the header file types.h may not be found.
Check with dpkg -L libc6-dev | grep types.h.
If it is missing, install it with
apt-get install libc6-dev.
If it exists but is not under /usr/include/sys/, add the file to that directory manually.