Thread classification
Threads can be divided into user-level threads and kernel-level threads, according to where their scheduler runs.
(1) User-level thread
User-level threads mainly address the cost of context switching. Their scheduling algorithm and scheduling process are decided entirely in user space, and no special kernel support is needed at run time. The operating system typically provides a user-space thread library offering thread creation, scheduling, cancellation and other functions, while the kernel still manages only the process. If one thread in a process issues a blocking system call, the whole process, including all of its other threads, is blocked. The main disadvantage of user-level threads is that the threads of one process cannot be scheduled onto multiple processors.
(2) Kernel-level threads
Kernel-level threads allow threads belonging to different processes to be scheduled by the kernel under a single, common priority scheme, so the concurrency of multi-processor machines can be exploited.
Most systems now let user-level threads and kernel-level threads coexist: user-level threads are mapped onto kernel-level threads in a "one-to-one" or "many-to-one" fashion. This meets the needs of multi-processor systems while keeping scheduling overhead low.
Linux implements threads outside the kernel; the kernel provides the interface do_fork() for creating processes. On top of it the kernel exposes two system calls, clone() and fork(), which ultimately invoke the do_fork() kernel API with different parameters. Of course, threads would be impossible without kernel support for multiple processes (actually lightweight processes) sharing a data segment. Therefore do_fork() accepts many flags, including CLONE_VM (share the address space), CLONE_FS (share file-system information), CLONE_FILES (share the file-descriptor table), CLONE_SIGHAND (share the signal-handler table) and CLONE_PID (share the process ID, valid only for the kernel process, i.e. process 0). When the fork system call is used, the kernel calls do_fork() with none of these sharing flags, so the new process gets an independent running environment. When pthread_create() is used to create a thread, all of these flags are set and __clone() is called, which passes them down to do_fork() in the kernel. The "process" created this way therefore shares its running environment with its creator; only its stack is private, and that stack is passed in through __clone().
Inside the kernel, Linux threads exist as lightweight processes with their own process-table entries; creation, synchronization, deletion and the other operations are all performed in the pthread library outside the kernel. The pthread library uses a management thread (__pthread_manager(), one per process) to manage thread creation and termination, assign thread IDs, and deliver thread-related signals (such as cancellation), while the main thread, i.e. the caller of pthread_create(), passes request information to the management thread through a pipe.
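As an illustration of the sharing flags described above, here is a minimal sketch (not how pthread_create() is actually implemented) that uses the glibc clone() wrapper to start a lightweight process sharing the caller's address space, file-system information, file descriptors and signal handlers; the child function, stack size and printed message are invented for the example.

/* Sketch only: a "thread-like" lightweight process created with clone(). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_counter = 0;            /* visible to the child because of CLONE_VM */

static int child_fn(void *arg)
{
    shared_counter = 42;                  /* the write lands in the parent's address space */
    return 0;
}

int main(void)
{
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);     /* the child needs its own stack, passed explicitly */
    if (!stack)
        return 1;

    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;
    pid_t pid = clone(child_fn, stack + stack_size, flags, NULL);
    if (pid == -1)
        return 1;

    waitpid(pid, NULL, 0);                /* reap the lightweight process */
    printf("shared_counter = %d\n", shared_counter);
    free(stack);
    return 0;
}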
Description of the main functions
1. Thread creation and exit
pthread_create: thread creation function

int pthread_create(pthread_t *thread_id, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);
The first parameter of pthread_create is a pointer to the thread identifier, the second sets the thread attributes, the third is the address of the function the thread will run, and the last is the argument passed to that function. Here our function thread needs no argument, so the last parameter is set to a null pointer. We also set the second parameter to a null pointer, which creates a thread with default attributes. On success the function returns 0; a non-zero return value means creation failed, the common error codes being EAGAIN and EINVAL. The former means the system limits the creation of new threads, for example because there are already too many; the latter means the attribute value given as the second parameter is invalid. After the thread is created successfully, the new thread runs the function determined by the third and fourth parameters, while the original thread continues with the next line of code.
pthread_join: wait for a thread to finish.
The function prototype is:

int pthread_join(pthread_t th, void **thread_return);
The first parameter is the identifier of the thread being waited for; the second is a user-supplied pointer that can store the return value of that thread. This function blocks: the calling thread waits until the target thread ends, and when the function returns, the resources of the waited-for thread are reclaimed. A thread can be joined by only one other thread and must be in the joinable (not detached) state.
pthread_exit function

There are two ways for a thread to end. One is that the function the thread is running returns, and the thread ends with it; the other is to call pthread_exit. Its prototype is:

void pthread_exit(void *retval);

The only parameter is the thread's return code. As long as the second parameter of pthread_join, thread_return, is not NULL, this value is passed back through thread_return. Finally, note that a thread cannot be waited for by several threads at once: the first caller of pthread_join returns successfully, and the remaining callers get the error code ESRCH.
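Putting the three calls together, a minimal sketch (the thread function body and its message are invented for illustration) might look like this:

#include <pthread.h>
#include <stdio.h>

static void *thread(void *arg)
{
    printf("hello from the new thread\n");
    pthread_exit(NULL);                  /* equivalent here to "return NULL;" */
}

int main(void)
{
    pthread_t id;
    void *retval;

    if (pthread_create(&id, NULL, thread, NULL) != 0) {  /* default attributes, no argument */
        fprintf(stderr, "thread creation failed\n");
        return 1;
    }
    pthread_join(id, &retval);           /* block until the thread ends, collect its return value */
    return 0;
}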
2. Thread attributes
The second parameter of pthread_create is the thread's attributes. Setting it to NULL selects the default attributes. Many attributes of a thread can be changed, chiefly the binding (scope) attribute, the detach state, the stack address, the stack size, and the priority. The defaults are unbound, non-detached, a 1 MB stack, and the same priority as the parent process. The basic concepts of the binding and detach attributes are explained first.
Binding attribute: Linux uses a "one-to-one" thread model, i.e. one user thread corresponds to one kernel thread (actually a lightweight process). The binding attribute means that a user thread is permanently assigned to a kernel thread; since CPU time slices are scheduled among kernel threads, a bound thread is guaranteed to always have a kernel thread available when it needs one. Conversely, with the unbound attribute the mapping between user threads and kernel threads is not fixed but is assigned and controlled by the system.
Detach attribute: the detach attribute determines how a thread terminates. In the non-detached case, when a thread ends the system resources it occupies are not released, i.e. it has not really terminated; only after pthread_join() returns are the resources of the created thread released. With the detached attribute, the resources a thread occupies are released immediately when it ends.
Note that if you make a thread detached and the thread runs very quickly, it may terminate before pthread_create even returns; after it terminates, its thread ID and system resources may already have been handed to another thread, so the caller of pthread_create would obtain the wrong thread ID.
Set the binding (scope) attribute:

int pthread_attr_init(pthread_attr_t *attr);
int pthread_attr_setscope(pthread_attr_t *attr, int scope);
int pthread_attr_getscope(const pthread_attr_t *attr, int *scope);

scope:
PTHREAD_SCOPE_SYSTEM: bound; the thread competes with all threads in the system.
PTHREAD_SCOPE_PROCESS: unbound; the thread competes only with the other threads in its process.
Set the detach attribute:

int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
int pthread_attr_getdetachstate(const pthread_attr_t *attr, int *detachstate);

detachstate:
PTHREAD_CREATE_DETACHED: detached.
PTHREAD_CREATE_JOINABLE: non-detached (joinable).
Set the scheduling policy:

int pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy);
int pthread_attr_getschedpolicy(const pthread_attr_t *attr, int *policy);

policy:
SCHED_FIFO: first in, first out.
SCHED_RR: round-robin.
SCHED_OTHER: implementation-defined (the default time-sharing policy).
Set the priority:

int pthread_attr_setschedparam(pthread_attr_t *attr, const struct sched_param *param);
int pthread_attr_getschedparam(const pthread_attr_t *attr, struct sched_param *param);
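A possible sketch combining these attribute calls; the worker function and the particular attribute choices are only illustrative, and pthread_attr_setstacksize() (not listed above) is used to set the stack size mentioned earlier.

#include <pthread.h>
#include <stdio.h>

#define STACK_SIZE (1024 * 1024)            /* 1 MB, the default mentioned above */

static void *worker(void *arg)
{
    printf("worker running\n");
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_attr_t attr;

    pthread_attr_init(&attr);
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);          /* bound thread */
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); /* detached: no join needed */
    pthread_attr_setstacksize(&attr, STACK_SIZE);

    if (pthread_create(&tid, &attr, worker, NULL) != 0)
        return 1;

    pthread_attr_destroy(&attr);             /* the attribute object can be destroyed after create */
    pthread_exit(NULL);                      /* let the detached thread finish before the process exits */
}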
3. Thread access control
1) Mutex (mutual-exclusion lock)
A mutex provides synchronization between threads through a lock mechanism: only one thread at a time is allowed to execute a given critical section of code.
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
(1) First initialize the lock with pthread_mutex_init(), or statically with pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
(2) Lock: pthread_mutex_lock() blocks waiting for the lock; pthread_mutex_trylock() returns EBUSY immediately if the lock is held.
(3) Unlock: pthread_mutex_unlock() requires the mutex to be in the locked state, and it must be unlocked by the thread that locked it.
(4) Destroy the lock with pthread_mutex_destroy() (the mutex must be unlocked at this point, otherwise EBUSY is returned).
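For example, a minimal sketch of the sequence above, protecting a hypothetical shared counter with a statically initialized mutex:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* static initialization */
static long counter = 0;

static void *increment(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);      /* enter the critical section */
        counter++;
        pthread_mutex_unlock(&lock);    /* leave the critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter); /* always 200000 thanks to the mutex */
    pthread_mutex_destroy(&lock);
    return 0;
}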
Mutexes come in two kinds, recursive and non-recursive; these are the POSIX names, and they are also called reentrant and non-reentrant. As inter-thread synchronization tools the two behave the same; the only difference is that the same thread may lock a recursive mutex repeatedly, while it may not lock a non-recursive mutex more than once.
The non-recursive mutex should be preferred, definitely not for performance but to make the design intent explicit. The performance difference between the two is actually small: the non-recursive one merely skips a counter and is slightly faster. Locking a non-recursive mutex a second time in the same thread deadlocks immediately; I consider this an advantage, because it forces us to think about the locking requirements of the code and exposes problems early, during coding. A recursive mutex is undeniably more convenient, since a thread cannot lock itself out, and I suspect this is why Java and Windows provide recursive locks by default. (The intrinsic lock built into the Java language is reentrant, its concurrency library provides ReentrantLock, and Windows' CRITICAL_SECTION is also reentrant; none of them seems to offer a lightweight non-recursive mutex.)
2) Condition variable (cond)
A condition variable is a mechanism for synchronizing threads around a shared (global) variable.
int pthread_cond_init(pthread_cond_t *cond, pthread_condattr_t *cond_attr);
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime);
int pthread_cond_destroy(pthread_cond_t *cond);
int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_broadcast(pthread_cond_t *cond);  // unblock all waiting threads
(1) Initialization: pthread_cond_init(), or statically pthread_cond_t cond = PTHREAD_COND_INITIALIZER; (the attribute is set to NULL).
(2) Wait for the condition: pthread_cond_wait() or pthread_cond_timedwait(). wait() releases the lock and blocks until the condition variable is signalled; timedwait() additionally sets a timeout and returns ETIMEDOUT if no signal arrives in time (holding the mutex ensures that only one thread at a time enters the wait).
(3) Signal the condition variable: pthread_cond_signal() wakes one waiting thread; pthread_cond_broadcast() wakes all waiting threads.
(4) Destroy the condition variable: pthread_cond_destroy(); no thread may still be waiting on it, otherwise EBUSY is returned.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime);
Both functions must be called while the associated mutex is locked.
When pthread_cond_signal() is called to release a thread blocked on a condition, it has no effect if no thread is currently blocked on that condition variable. Windows behaves differently: when SetEvent triggers an auto-reset event and no thread is blocked on it, the call still takes effect and the event stays in the signalled state.
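The following sketch (with invented names ready, notifier and waiter) illustrates the consequence on Linux: because a signal sent while nobody waits is not remembered by the condition variable itself, the event must be recorded in a shared predicate that the waiter re-checks under the mutex.

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static int ready = 0;                     /* the predicate the signal refers to */

void notifier(void)
{
    pthread_mutex_lock(&m);
    ready = 1;                            /* record the event even if nobody is waiting yet */
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);
}

void waiter(void)
{
    pthread_mutex_lock(&m);
    while (!ready)                        /* the loop also guards against spurious wakeups */
        pthread_cond_wait(&c, &m);        /* atomically unlocks m and blocks */
    pthread_mutex_unlock(&m);
}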
Producer-consumer problem under Linux (using mutex locks and condition variables):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>

#define BUFFER_SIZE 16

struct prodcons {
    int buffer[BUFFER_SIZE];
    pthread_mutex_t lock;      /* mutex ensuring exclusive access to buffer */
    int readpos, writepos;     /* positions for reading and writing */
    pthread_cond_t notempty;   /* signaled when buffer is not empty */
    pthread_cond_t notfull;    /* signaled when buffer is not full */
};

/* initialize a buffer */
void init(struct prodcons *b)
{
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->notempty, NULL);
    pthread_cond_init(&b->notfull, NULL);
    b->readpos = 0;
    b->writepos = 0;
}

/* store an integer in the buffer */
void put(struct prodcons *b, int data)
{
    pthread_mutex_lock(&b->lock);
    /* wait until buffer is not full */
    while ((b->writepos + 1) % BUFFER_SIZE == b->readpos) {
        printf("wait for not full\n");
        pthread_cond_wait(&b->notfull, &b->lock);
    }
    b->buffer[b->writepos] = data;
    b->writepos++;
    b->writepos %= BUFFER_SIZE;
    pthread_cond_signal(&b->notempty);  /* signal buffer is not empty */
    pthread_mutex_unlock(&b->lock);
}

/* read and remove an integer from the buffer */
int get(struct prodcons *b)
{
    int data;
    pthread_mutex_lock(&b->lock);
    /* wait until buffer is not empty */
    while (b->writepos == b->readpos) {
        printf("wait for not empty\n");
        pthread_cond_wait(&b->notempty, &b->lock);
    }
    data = b->buffer[b->readpos];
    b->readpos++;
    b->readpos %= BUFFER_SIZE;
    pthread_cond_signal(&b->notfull);   /* signal buffer is not full */
    pthread_mutex_unlock(&b->lock);
    return data;
}

#define OVER -1
struct prodcons buffer;

void *producer(void *data)
{
    int n;
    for (n = 0; n < 50; ++n) {
        printf("put-->%d\n", n);
        put(&buffer, n);
    }
    put(&buffer, OVER);
    printf("producer stopped\n");
    return NULL;
}

void *consumer(void *data)
{
    while (1) {
        int d = get(&buffer);
        if (d == OVER)
            break;
        printf("get-->%d\n", d);
    }
    printf("consumer stopped\n");
    return NULL;
}

int main()
{
    pthread_t tha, thb;
    void *retval;
    init(&buffer);
    pthread_create(&tha, NULL, producer, 0);
    pthread_create(&thb, NULL, consumer, 0);
    pthread_join(tha, &retval);
    pthread_join(thb, &retval);
    return 0;
}
3) Semaphores
Like processes, threads can also synchronize through semaphores, even though threads are lightweight. The names of the semaphore functions all start with "sem_". Threads use four basic semaphore functions.
#include <semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
This initializes the semaphore pointed to by sem, sets its sharing option (Linux supports only 0 here, meaning the semaphore is local to the current process), and gives it the initial value value.
Two atomic operation functions, each taking a pointer to a semaphore object initialized by sem_init:

int sem_wait(sem_t *sem);     // decrement the semaphore by 1; if its value is 0, sem_wait blocks until another thread makes it non-zero
int sem_post(sem_t *sem);     // increment the semaphore's value by 1

int sem_destroy(sem_t *sem);
This last function cleans up the semaphore after we have finished using it and returns all the resources it holds.
Using semaphores to implement the producer-consumer problem:
Four semaphores are used here. Two of them, occupied and empty, handle the synchronization between producer and consumer threads; pmut handles mutual exclusion among multiple producers, and cmut handles mutual exclusion among multiple consumers. empty is initialized to N (the number of free slots in the bounded buffer), occupied to 0, and pmut and cmut to 1.
Reference code:
#include <semaphore.h>

#define BSIZE 64

typedef struct {
    char buf[BSIZE];
    sem_t occupied;
    sem_t empty;
    int nextin;
    int nextout;
    sem_t pmut;
    sem_t cmut;
} buffer_t;

buffer_t buffer;

void init(buffer_t *b)
{
    sem_init(&b->occupied, 0, 0);
    sem_init(&b->empty, 0, BSIZE);
    sem_init(&b->pmut, 0, 1);
    sem_init(&b->cmut, 0, 1);
    b->nextin = b->nextout = 0;
}

void producer(buffer_t *b, char item)
{
    sem_wait(&b->empty);       /* wait for a free slot */
    sem_wait(&b->pmut);        /* exclude other producers */
    b->buf[b->nextin] = item;
    b->nextin++;
    b->nextin %= BSIZE;
    sem_post(&b->pmut);
    sem_post(&b->occupied);    /* one more filled slot */
}

char consumer(buffer_t *b)
{
    char item;
    sem_wait(&b->occupied);    /* wait for a filled slot */
    sem_wait(&b->cmut);        /* exclude other consumers */
    item = b->buf[b->nextout];
    b->nextout++;
    b->nextout %= BSIZE;
    sem_post(&b->cmut);
    sem_post(&b->empty);       /* one more free slot */
    return item;
}
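The reference code above has no driver; a possible way to exercise it (the thread entry points and the message text are assumptions for illustration) is:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *producer_thread(void *arg)
{
    buffer_t *b = arg;
    const char *msg = "hello, semaphores";
    for (size_t i = 0; i <= strlen(msg); i++)   /* include the '\0' terminator */
        producer(b, msg[i]);
    return NULL;
}

static void *consumer_thread(void *arg)
{
    buffer_t *b = arg;
    char ch;
    while ((ch = consumer(b)) != '\0')          /* stop at the terminator */
        putchar(ch);
    putchar('\n');
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    init(&buffer);
    pthread_create(&p, NULL, producer_thread, &buffer);
    pthread_create(&c, NULL, consumer_thread, &buffer);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}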