


Example code for multi-process and process pool (Processing library) in python
This article mainly introduces the detailed explanation of python's multi-process and process pool (Processing library), which is of great practical value. Friends who need it can refer to it
Environment:win7+python2.7
I have always wanted to learn multi-process or multi-threading, but before I just looked at some basic knowledge and a simple introduction, I couldn't understand how to apply it, until I saw a crawler project on github some time ago that involved multiple Process, multi-threading related content, while reading Baidu related knowledge points, now write down some relevant knowledge points and some applications for a record.
First of all, let’s talk about what a process is: a process is a program on the computer Once an activity is executed, when a program is run, a process is started. The process is divided into system process and user process. As long as the process is used to complete various functions of the operating system, it is a system process, and they are in a running state. The operating system itself; and all processes started by you are user processes. A process is the unit by which the operating system allocates resources.
Intuitively speaking, the user name marked system in the task manager is the system process, and the one marked administrator is the user process. In addition, net is netro, and lcacal service is local service. More specific information about the process. You can encyclopedia, you have to save some effort here, otherwise you will not be able to take it back.
1. Simple use of multi-process
As shown in the figure, multiprocessing has multiple functions, many I haven’t understood it yet, so I’ll just talk about what I know so far.
Process Creation:Process(target=mainly running function, name= The custom process name does not need to be written, args=(parameter))
Method:
is_alive(): Determine whether the process is alive
-
join([timeout]): The child process ends before executing the next step. Timeout is the timeout. Sometimes the process is blocked. The timeout is set so that the program can continue to run.
run(): If you do not specify a target when creating a Process object, the run method of Process will be executed by default
-
start(): Start the process, distinguish run()
terminate(): Terminate the process. Terminating the process is not that simple. It seems that it would be better to use the psutil package. I will write down more when I have the opportunity.
Among them, Process starts a process with start().
Attributes:
authkey: Find this sentence in the authkey() function in the document: Set authorization key of process. Authorization key, no relevant application examples have been found so far. How is this key used? The article does not mention it
daemon: It automatically terminates after the parent process terminates, and it cannot generate new ones. Process must be set before start()
- ##exitcode: The process is None when running. If it is -N, it means it was terminated by signal N
- name: The name of the process, customized
- pid: Each process has a unique PID number.
1.Process(),start(),join()
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() p1=Process(target=fun1,args=(4,)) p2 = Process(target=fun2, args=(6,)) p1.start() p2.start() p1.join() p2.join() b=time.time() print 'finish',b-a
this is fun2 Mon Jun 05 13:48:04 2017 this is fun1 Mon Jun 05 13:48:04 2017 fun1 finish Mon Jun 05 13:48:08 2017 fun2 finish Mon Jun 05 13:48:10 2017 finish 6.20300006866 Process finished with exit code 0
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() p1=Process(target=fun1,args=(4,)) p2 = Process(target=fun2, args=(6,)) p1.start() p1.join() p2.start() p2.join() b=time.time() print 'finish',b-a
this is fun1 Mon Jun 05 14:19:28 2017 fun1 finish Mon Jun 05 14:19:32 2017 this is fun2 Mon Jun 05 14:19:32 2017 fun2 finish Mon Jun 05 14:19:38 2017 finish 10.1229999065 Process finished with exit code 0
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() p1=Process(target=fun1,args=(4,)) p2 = Process(target=fun2, args=(6,)) p1.start() p2.start() p1.join() #p2.join() b=time.time() print 'finish',b-a
this is fun1 Mon Jun 05 14:23:57 2017 this is fun2 Mon Jun 05 14:23:58 2017 fun1 finish Mon Jun 05 14:24:01 2017 finish 4.05900001526 fun2 finish Mon Jun 05 14:24:04 2017 Process finished with exit code 0
2.name,daemon,is_alive():
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() p1=Process(name='fun1进程',target=fun1,args=(4,)) p2 = Process(name='fun2进程',target=fun2, args=(6,)) p1.daemon=True p2.daemon = True p1.start() p2.start() p1.join() print p1,p2 print '进程1:',p1.is_alive(),'进程2:',p2.is_alive() #p2.join() b=time.time() print 'finish',b-a
this is fun2 Mon Jun 05 14:43:49 2017 this is fun1 Mon Jun 05 14:43:49 2017 fun1 finish Mon Jun 05 14:43:53 2017 <Process(fun1进程, stopped daemon)> <Process(fun2进程, started daemon)> 进程1: False 进程2: True finish 4.06500005722 Process finished with exit code 0
3.run()
run()When Process does not specify a target function, the run() function is used by default to run the program,# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a = time.time() p=Process() p.start() p.join() b = time.time() print 'finish', b - a
finish 0.0840001106262
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(): print 'this is fun1',time.ctime() time.sleep(2) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a = time.time() p=Process() p.run=fun1 p.start() p.join() b = time.time() print 'finish', b - a
this is fun1 Mon Jun 05 16:34:41 2017 fun1 finish Mon Jun 05 16:34:43 2017 finish 2.11500000954 Process finished with exit code 0
# -*- coding:utf-8 -*- from multiprocessing import Process import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a = time.time() p=Process() p.run=fun1(2) p.start() p.join() b = time.time() print 'finish', b - a
this is fun1 Mon Jun 05 16:36:27 2017 fun1 finish Mon Jun 05 16:36:29 2017 Process Process-1: Traceback (most recent call last): File "E:\Anaconda2\lib\multiprocessing\process.py", line 258, in _bootstrap self.run() TypeError: 'NoneType' object is not callable finish 2.0529999733 Process finished with exit code 0
二.Process Pool
对于需要使用几个甚至十几个进程时,我们使用Process还是比较方便的,但是如果要成百上千个进程,用Process显然太笨了,multiprocessing提供了Pool类,即现在要讲的进程池,能够将众多进程放在一起,设置一个运行进程上限,每次只运行设置的进程数,等有进程结束,再添加新的进程
Pool(processes =num):设置运行进程数,当一个进程运行完,会添加新的进程进去
apply_async(函数,(参数)):非阻塞,其中参数是tulpe类型,
apply(函数,(参数)):阻塞
close():关闭pool,不能再添加新的任务
terminate():结束运行的进程,不再处理未完成的任务
join():和Process介绍的作用一样, 但要在close或terminate之后使用。
1.单个进程池
# -*- coding:utf-8 -*- from multiprocessing import Pool import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() pool = Pool(processes =3) # 可以同时跑3个进程 for i in range(3,8): pool.apply_async(fun1,(i,)) pool.close() pool.join() b=time.time() print 'finish',b-a
结果:
this is fun1 Mon Jun 05 15:15:38 2017 this is fun1 Mon Jun 05 15:15:38 2017 this is fun1 Mon Jun 05 15:15:38 2017 fun1 finish Mon Jun 05 15:15:41 2017 this is fun1 Mon Jun 05 15:15:41 2017 fun1 finish Mon Jun 05 15:15:42 2017 this is fun1 Mon Jun 05 15:15:42 2017 fun1 finish Mon Jun 05 15:15:43 2017 fun1 finish Mon Jun 05 15:15:47 2017 fun1 finish Mon Jun 05 15:15:49 2017 finish 11.1370000839 Process finished with exit code 0
从上面的结果可以看到,设置了3个运行进程上限,15:15:38这个时间同时开始三个进程,当第一个进程结束时(参数为3秒那个进程),会添加新的进程,如此循环,直至进程池运行完再执行主进程语句b=time.time() print 'finish',b-a .这里用到非阻塞apply_async(),再来对比下阻塞apply()
# -*- coding:utf-8 -*- from multiprocessing import Pool import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() pool = Pool(processes =3) # 可以同时跑3个进程 for i in range(3,8): pool.apply(fun1,(i,)) pool.close() pool.join() b=time.time() print 'finish',b-a
结果:
this is fun1 Mon Jun 05 15:59:26 2017 fun1 finish Mon Jun 05 15:59:29 2017 this is fun1 Mon Jun 05 15:59:29 2017 fun1 finish Mon Jun 05 15:59:33 2017 this is fun1 Mon Jun 05 15:59:33 2017 fun1 finish Mon Jun 05 15:59:38 2017 this is fun1 Mon Jun 05 15:59:38 2017 fun1 finish Mon Jun 05 15:59:44 2017 this is fun1 Mon Jun 05 15:59:44 2017 fun1 finish Mon Jun 05 15:59:51 2017 finish 25.1610000134 Process finished with exit code 0
可以看到,阻塞是当一个进程结束后,再进行下一个进程,一般我们都用非阻塞apply_async()
2.多个进程池
上面是使用单个进程池的,对于多个进程池,我们可以用for循环,直接看代码
# -*- coding:utf-8 -*- from multiprocessing import Pool import time def fun1(t): print 'this is fun1',time.ctime() time.sleep(t) print 'fun1 finish',time.ctime() def fun2(t): print 'this is fun2',time.ctime() time.sleep(t) print 'fun2 finish',time.ctime() if name == 'main': a=time.time() pool = Pool(processes =3) # 可以同时跑3个进程 for fun in [fun1,fun2]: for i in range(3,8): pool.apply_async(fun,(i,)) pool.close() pool.join() b=time.time() print 'finish',b-a
结果:
this is fun1 Mon Jun 05 16:04:38 2017 this is fun1 Mon Jun 05 16:04:38 2017 this is fun1 Mon Jun 05 16:04:38 2017 fun1 finish Mon Jun 05 16:04:41 2017 this is fun1 Mon Jun 05 16:04:41 2017 fun1 finish Mon Jun 05 16:04:42 2017 this is fun1 Mon Jun 05 16:04:42 2017 fun1 finish Mon Jun 05 16:04:43 2017 this is fun2 Mon Jun 05 16:04:43 2017 fun2 finish Mon Jun 05 16:04:46 2017 this is fun2 Mon Jun 05 16:04:46 2017 fun1 finish Mon Jun 05 16:04:47 2017 this is fun2 Mon Jun 05 16:04:47 2017 fun1 finish Mon Jun 05 16:04:49 2017 this is fun2 Mon Jun 05 16:04:49 2017 fun2 finish Mon Jun 05 16:04:50 2017 this is fun2 Mon Jun 05 16:04:50 2017 fun2 finish Mon Jun 05 16:04:52 2017 fun2 finish Mon Jun 05 16:04:55 2017 fun2 finish Mon Jun 05 16:04:57 2017 finish 19.1670000553 Process finished with exit code 0
看到了,在fun1运行完接着运行fun2.
另外对于没有参数的情况,就直接 pool.apply_async(funtion),无需写上参数.
在学习编写程序过程,曾遇到不用if _name_ == '_main_':而直接运行程序,这样结果会出错,经查询,在Windows上要想使用进程模块,就必须把有关进程的代码写在当前.py文件的if _name_ == ‘_main_' :语句的下面,才能正常使用Windows下的进程模块。Unix/Linux下则不需要。原因有人这么说:在执行的時候,由于你写的 py 会被当成module 读进执行。所以,一定要判断自身是否为 _main_。也就是要:
if name == ‘main' : # do something.
这里我自己还搞不清楚,期待以后能够理解
The above is the detailed content of Example code for multi-process and process pool (Processing library) in python. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



VS Code extensions pose malicious risks, such as hiding malicious code, exploiting vulnerabilities, and masturbating as legitimate extensions. Methods to identify malicious extensions include: checking publishers, reading comments, checking code, and installing with caution. Security measures also include: security awareness, good habits, regular updates and antivirus software.

In VS Code, you can run the program in the terminal through the following steps: Prepare the code and open the integrated terminal to ensure that the code directory is consistent with the terminal working directory. Select the run command according to the programming language (such as Python's python your_file_name.py) to check whether it runs successfully and resolve errors. Use the debugger to improve debugging efficiency.

VS Code can run on Windows 8, but the experience may not be great. First make sure the system has been updated to the latest patch, then download the VS Code installation package that matches the system architecture and install it as prompted. After installation, be aware that some extensions may be incompatible with Windows 8 and need to look for alternative extensions or use newer Windows systems in a virtual machine. Install the necessary extensions to check whether they work properly. Although VS Code is feasible on Windows 8, it is recommended to upgrade to a newer Windows system for a better development experience and security.

VS Code can be used to write Python and provides many features that make it an ideal tool for developing Python applications. It allows users to: install Python extensions to get functions such as code completion, syntax highlighting, and debugging. Use the debugger to track code step by step, find and fix errors. Integrate Git for version control. Use code formatting tools to maintain code consistency. Use the Linting tool to spot potential problems ahead of time.

PHP is suitable for web development and rapid prototyping, and Python is suitable for data science and machine learning. 1.PHP is used for dynamic web development, with simple syntax and suitable for rapid development. 2. Python has concise syntax, is suitable for multiple fields, and has a strong library ecosystem.

VS Code is available on Mac. It has powerful extensions, Git integration, terminal and debugger, and also offers a wealth of setup options. However, for particularly large projects or highly professional development, VS Code may have performance or functional limitations.

PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

The key to running Jupyter Notebook in VS Code is to ensure that the Python environment is properly configured, understand that the code execution order is consistent with the cell order, and be aware of large files or external libraries that may affect performance. The code completion and debugging functions provided by VS Code can greatly improve coding efficiency and reduce errors.
