To help you get more out of Python timers, this article also covers some background on Python classes. Using context managers and decorators to improve the timer further is left to subsequent articles because of space limitations and is outside the scope of this one.
First, we add a Python timer to a piece of code to monitor its performance.
The built-in time[1] module in Python has several functions that can measure time: monotonic(), perf_counter(), process_time(), and time().
Python 3.7 introduced several new functions, such as thread_time()[2], as well as nanosecond versions of all the functions above, named with an _ns suffix. For example, perf_counter_ns() is the nanosecond version of perf_counter().
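As a small illustration of the _ns variants, the following sketch times a short sleep with perf_counter_ns(), which returns an integer number of nanoseconds instead of a float number of seconds. The 10 ms sleep and the variable names are arbitrary choices made for this example.

```python
import time

# perf_counter_ns() returns an int (nanoseconds); perf_counter() returns a float (seconds)
start_ns = time.perf_counter_ns()
time.sleep(0.01)  # stand-in for the code being timed
elapsed_ns = time.perf_counter_ns() - start_ns

print(f"Elapsed: {elapsed_ns} ns ({elapsed_ns / 1e9:0.4f} seconds)")
```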
perf_counter()
Returns the value in seconds of the performance counter, i.e. the clock with the highest available resolution for measuring short durations.
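One way to see how perf_counter() compares with the other clocks is time.get_clock_info(), which reports the properties of a clock on the current system. The sketch below only prints what your platform reports, so the values will differ between operating systems. The list of clock names is chosen for this example.

```python
import time

clock_names = ["monotonic", "perf_counter", "process_time", "time"]

for name in clock_names:
    info = time.get_clock_info(name)
    # resolution is given in seconds; adjustable means the clock can be
    # changed, for example by a system administrator or by NTP
    print(
        f"{name:>13}: resolution={info.resolution}, "
        f"monotonic={info.monotonic}, adjustable={info.adjustable}"
    )
```

A clock that is monotonic and not adjustable is the safer choice for measuring durations, because its readings cannot jump backwards between two calls.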
First, create a Python timer using perf_counter(). It can then be compared with the other Python timer functions to see the advantages of perf_counter().
Create a script and define a short function: download a set of data from Tsinghua Cloud.
    import requests


    def main():
        source_url = 'https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/files/?p=%2Fall_six_datasets.zip&dl=1'
        headers = {'User-Agent': 'Mozilla/5.0'}
        res = requests.get(source_url, headers=headers)
        # Assumes the dataset/ directory already exists
        with open('dataset/datasets.zip', 'wb') as f:
            f.write(res.content)


    if __name__ == "__main__":
        main()
We can use Python timers to monitor the performance of this script.
Now use time.perf_counter() to create a timer. It is a counter that is well suited to timing the performance of a part of your code.
perf_counter() measures time in seconds starting at some unspecified moment, which means that the return value from a single call to this function is of no use. But when looking at the difference between two calls to perf_counter(), you can calculate how many seconds passed between the two calls.
    >>> import time
    >>> time.perf_counter()
    394.540232282
    >>> time.perf_counter()  # A few seconds later
    413.31714087
In this example, the two calls to perf_counter() are nearly 19 seconds apart. This can be confirmed by calculating the difference between the two outputs: 413.31714087 - 394.540232282 ≈ 18.78.
You can now add a Python timer to the example code:
    # download_data.py
    import time

    import requests


    def main():
        tic = time.perf_counter()
        source_url = 'https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/files/?p=%2Fall_six_datasets.zip&dl=1'
        headers = {'User-Agent': 'Mozilla/5.0'}
        res = requests.get(source_url, headers=headers)
        with open('dataset/datasets.zip', 'wb') as f:
            f.write(res.content)
        toc = time.perf_counter()
        print(f"Program took: {toc - tic:0.4f} seconds")


    if __name__ == "__main__":
        main()
Note that the print() call reports the time the whole program took to run, calculated as the difference between the two calls to perf_counter().
The f in front of the string in the print() call indicates that this is an f-string, a convenient way to format text strings. :0.4f is a format specifier saying that the number toc - tic should be printed as a decimal number with four decimal places.
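To make the format specifier concrete, here is a short sketch that formats one number in a few different ways. The value of elapsed is an example value, not a real measurement.

```python
elapsed = 18.776908588  # example value, not a real measurement

four_places = f"{elapsed:0.4f}"  # four decimal places
one_place = f"{elapsed:0.1f}"    # one decimal place
padded = f"{elapsed:10.2f}"      # two decimals, right-aligned in a width of 10

print(four_places)   # 18.7769
print(one_place)     # 18.8
print(repr(padded))  # '     18.78'
```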
Run the program and you can see the elapsed time of the program:
Program took: 0.0260 seconds
It’s that simple. Next, let’s learn how to wrap a Python timer in a class, and later in a context manager and a decorator (covered in two subsequent articles in this series, to be updated), so that the timer can be used more consistently and conveniently.
Here we need at least one variable to store the state of the Python timer. Next, we create a class that does the same thing as calling perf_counter() manually, but in a more readable and consistent way.
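Throughout this article, you will create and update a Timer class and use it to time your code in several different ways. A finished timer of this kind is also published on PyPI as the codetiming package, which can be installed with: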
$ python -m pip install codetiming
Class
Classes are the main building blocks of object-oriented programming. A class is essentially a template that can be used to create objects.
In Python, classes are useful when you need to model something that has to track a specific state. Generally speaking, a class is a collection of properties, called attributes, and behaviors, called methods.
Classes are good at tracking state. In the Timer class, you want to track when the timer started and how much time has elapsed. The first implementation of the Timer class has a ._start_time attribute and .start() and .stop() methods. Add the following code to a file called timer.py:
    # timer.py
    import time


    class TimerError(Exception):
        """A custom exception used to report errors in use of Timer class"""


    class Timer:
        def __init__(self):
            self._start_time = None

        def start(self):
            """Start a new timer"""
            if self._start_time is not None:
                raise TimerError("Timer is running. Use .stop() to stop it")
            self._start_time = time.perf_counter()

        def stop(self):
            """Stop the timer, and report the elapsed time"""
            if self._start_time is None:
                raise TimerError("Timer is not running. Use .start() to start it")
            elapsed_time = time.perf_counter() - self._start_time
            self._start_time = None
            print(f"Elapsed time: {elapsed_time:0.4f} seconds")
Take a moment to read through the code carefully; there are a few things worth noting.
First, a TimerError class is defined. The (Exception) notation indicates that TimerError inherits from a parent class named Exception, the built-in class used for error handling. There is no need to add any attributes or methods to TimerError, but having a custom error gives you more flexibility in handling problems inside Timer.
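Next, the Timer class itself is defined. When an object is created, or instantiated, from a class, the code calls the special method .__init__() to initialize the instance. In this first version of Timer, only the ._start_time attribute is initialized. It tracks the state of the Python timer: its value is None while the timer is not running, and once the timer is running it holds the time at which the timer started.
Note: The leading underscore (_) in ._start_time is a Python convention. It signals that ._start_time is an internal attribute that users of the Timer class should not manipulate.
When .start() is called to start a new Python timer, it first checks that the timer is not already running. It then stores the current value of perf_counter() in ._start_time.
When .stop() is called, on the other hand, it first checks that the Python timer is running. If it is, the elapsed time is calculated as the difference between the current value of perf_counter() and the value stored in ._start_time. Finally, ._start_time is reset so that the timer can be restarted, and the elapsed time is printed.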
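To see the state checks in action, the sketch below repeats the Timer class from timer.py so that it runs on its own, then calls .start() twice and calls .stop() on a timer that is already stopped. Both misuses raise TimerError, which the calling code can catch. The errors list is only there to collect the messages for this example.

```python
import time


class TimerError(Exception):
    """Raised when Timer is used incorrectly"""


class Timer:
    def __init__(self):
        self._start_time = None

    def start(self):
        if self._start_time is not None:
            raise TimerError("Timer is running. Use .stop() to stop it")
        self._start_time = time.perf_counter()

    def stop(self):
        if self._start_time is None:
            raise TimerError("Timer is not running. Use .start() to start it")
        elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None
        print(f"Elapsed time: {elapsed_time:0.4f} seconds")


errors = []
t = Timer()
t.start()
try:
    t.start()  # second start while the timer is already running
except TimerError as err:
    errors.append(str(err))

t.stop()
try:
    t.stop()  # the timer has already been stopped and reset
except TimerError as err:
    errors.append(str(err))

print(errors)
```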
Here is how to use Timer:
    from timer import Timer

    t = Timer()
    t.start()
    # A few seconds later
    t.stop()
Elapsed time: 3.8191 seconds
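Compare this example with the earlier one that used perf_counter() directly. The structure of the code is similar, but the code is now clearer, which is one of the benefits of using a class. By choosing class, method, and attribute names carefully, you can make your code very descriptive!
Now use the Timer class in download_data.py. Only a few changes to the previous code are needed:
    # download_data.py
    import requests

    from timer import Timer


    def main():
        t = Timer()
        t.start()
        source_url = 'https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/files/?p=%2Fall_six_datasets.zip&dl=1'
        headers = {'User-Agent': 'Mozilla/5.0'}
        res = requests.get(source_url, headers=headers)
        with open('dataset/datasets.zip', 'wb') as f:
            f.write(res.content)
        t.stop()


    if __name__ == "__main__":
        main()
Note that this code is very similar to the code used before. Besides making the code more readable, Timer takes care of printing the elapsed time to the console, which makes the reporting of elapsed time more consistent. When you run the code, the output is almost the same:
Elapsed time: 0.5020 seconds ...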
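Printing the elapsed time from Timer may be consistent, but the approach is not very flexible. Next, we add some flexibility to the code.
So far, we have seen that classes are suitable when we want to encapsulate state and ensure consistent behavior in code. In this section, we add more convenience and flexibility to the Python timer. How?
First, customize the text used to report the elapsed time. In the previous code, the text f"Elapsed time: {elapsed_time:0.4f} seconds" is hard-coded into .stop(). To make the class more flexible, you can use instance variables, whose values are usually passed as arguments to .__init__() and stored as attributes on self. For convenience, we can also provide reasonable defaults.
To add .text as a Timer instance variable, make the following change in timer.py:
    # timer.py
    def __init__(self, text="Elapsed time: {:0.4f} seconds"):
        self._start_time = None
        self.text = text
Note that the default text, "Elapsed time: {:0.4f} seconds", is given as a regular string, not an f-string. An f-string cannot be used here because f-strings are evaluated immediately, and when you instantiate Timer, your code has not yet calculated the elapsed time.
Note: If you do want to use an f-string to specify .text, you need to use double curly braces to escape the curly braces that the actual elapsed time will replace.
For example: f"Finished {task} in {{:0.4f}} seconds". If the value of task is "reading", this f-string is evaluated as "Finished reading in {:0.4f} seconds".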
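The two-step substitution described in the note can be sketched as follows. The elapsed value of 1.5 seconds is made up for the example.

```python
task = "reading"

# Double braces survive f-string evaluation as single braces
text = f"Finished {task} in {{:0.4f}} seconds"
print(text)     # Finished reading in {:0.4f} seconds

# The remaining placeholder is filled in later with .format()
message = text.format(1.5)
print(message)  # Finished reading in 1.5000 seconds
```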
In .stop(), .text is used as a template and filled in with the .format() method:
    # timer.py
    def stop(self):
        """Stop the timer, and report the elapsed time"""
        if self._start_time is None:
            raise TimerError("Timer is not running. Use .start() to start it")
        elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None
        print(self.text.format(elapsed_time))
After this update to timer.py, the text can be changed as follows:
    from timer import Timer

    t = Timer(text="You waited {:.1f} seconds")
    t.start()
    # A few seconds later
    t.stop()
You waited 4.1 seconds
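Next, suppose we don't just want to print a message to the console but also want to keep the time measurements, for example to store them in a database. This can be done by returning the value of elapsed_time from .stop(). The calling code can then choose to either ignore the return value or save it for later processing.
You may also want to integrate Timer into your logging. To support logging or other output from the timer, the call to print() has to be changed so that the user can supply their own logging function. This can be done in a similar way to how the text was customized earlier:
    # timer.py
    # ...
    class Timer:
        def __init__(
            self,
            text="Elapsed time: {:0.4f} seconds",
            logger=print
        ):
            self._start_time = None
            self.text = text
            self.logger = logger

        # Other methods are unchanged

        def stop(self):
            """Stop the timer, and report the elapsed time"""
            if self._start_time is None:
                raise TimerError("Timer is not running. Use .start() to start it")
            elapsed_time = time.perf_counter() - self._start_time
            self._start_time = None
            if self.logger:
                self.logger(self.text.format(elapsed_time))
            return elapsed_time
Instead of using print() directly, the class has another instance variable, self.logger, which refers to a function that accepts a string as an argument. Besides print(), you can use functions such as logging.info() or the .write() method of a file object. Also note the if test, which lets you turn off printing entirely by passing logger=None.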
Here are two examples that show the new functionality in action:
    >>> from timer import Timer
    >>> import logging
    >>> t = Timer(logger=logging.warning)
    >>> t.start()
    >>> t.stop()  # A few seconds later
    WARNING:root:Elapsed time: 3.1610 seconds
    3.1609658249999484
    >>> t = Timer(logger=None)
    >>> t.start()
    >>> value = t.stop()  # A few seconds later
    >>> value
    4.710851433001153
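Because the logger can be any callable that accepts a string, a bound method works as well. The sketch below repeats a trimmed-down Timer (without the TimerError checks, to keep it short) and first collects messages in a list through list.append, then writes to an in-memory text buffer through its .write() method. The names messages and buffer are illustrative only.

```python
import io
import time


class Timer:
    """Trimmed-down copy of the Timer above, without error checks"""

    def __init__(self, text="Elapsed time: {:0.4f} seconds", logger=print):
        self._start_time = None
        self.text = text
        self.logger = logger

    def start(self):
        self._start_time = time.perf_counter()

    def stop(self):
        elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None
        if self.logger:
            self.logger(self.text.format(elapsed_time))
        return elapsed_time


# Collect the messages in a list instead of printing them
messages = []
t = Timer(logger=messages.append)
t.start()
time.sleep(0.01)  # stand-in for the code being timed
t.stop()

# Write the message to a file-like object; .write() adds no newline by itself,
# so the newline is included in the text template
buffer = io.StringIO()
t = Timer(text="Elapsed time: {:0.4f} seconds\n", logger=buffer.write)
t.start()
time.sleep(0.01)
elapsed = t.stop()

print(messages)
print(buffer.getvalue(), end="")
```

Passing the callable in, rather than hard-coding the output, keeps Timer unaware of where its messages end up.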
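The third improvement is the ability to accumulate time measurements, for example when a slow function is called in a loop. This is added in the form of named timers, with a dictionary that keeps track of every Python timer in the code.
We extend the download_data.py script:
    # download_data.py
    import requests

    from timer import Timer


    def main():
        t = Timer()
        t.start()
        source_url = 'https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/files/?p=%2Fall_six_datasets.zip&dl=1'
        headers = {'User-Agent': 'Mozilla/5.0'}
        for i in range(10):
            res = requests.get(source_url, headers=headers)
            with open('dataset/datasets.zip', 'wb') as f:
                f.write(res.content)
        t.stop()


    if __name__ == "__main__":
        main()
A subtle issue with this code is that it measures not only the time it takes to download the data but also the time Python spends storing the data to disk. This may not be important, since the time spent on one step is sometimes negligible compared with the other. Still, it would be better to have a way to time each step precisely.
There are several ways to work around this without changing the current implementation of Timer, but supporting this use case directly is useful and takes only a few lines of code.
First, introduce a dictionary called .timers as a class variable on Timer, so that all instances of Timer share it. You implement it by defining it outside any methods:
    class Timer:
        timers = {}
Class variables can be accessed either directly on the class or through an instance of the class:
    >>> from timer import Timer
    >>> Timer.timers
    {}
    >>> t = Timer()
    >>> t.timers
    {}
    >>> Timer.timers is t.timers
    True
In both cases, the code returns the same empty class dictionary.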
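The sharing behavior of a class variable can be sketched with a stripped-down class. The Timer below contains nothing but the dictionary and is only a stand-in for the real class. It also shows a common pitfall: assigning to the attribute on an instance does not change the shared dictionary.

```python
class Timer:
    timers = {}  # class variable, shared by all instances


a = Timer()
b = Timer()

# Mutating the dictionary through one instance is visible everywhere
a.timers["download"] = 1.5
print(b.timers)      # {'download': 1.5}
print(Timer.timers)  # {'download': 1.5}

# Assigning to the attribute on an instance creates a separate instance
# attribute and leaves the shared class dictionary untouched
a.timers = {}
print(a.timers)      # {}
print(Timer.timers)  # {'download': 1.5}
```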
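Next, add an optional name to the Python timer. The name serves two different purposes: looking up the elapsed time later in your code, and accumulating the measurements of timers that share the same name.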
To add names to the Python timer, timer.py needs two changes. First, Timer should accept a name parameter. Second, the elapsed time should be added to .timers when the timer stops:
    # timer.py
    # ...
    class Timer:
        timers = {}

        def __init__(
            self,
            name=None,
            text="Elapsed time: {:0.4f} seconds",
            logger=print,
        ):
            self._start_time = None
            self.name = name
            self.text = text
            self.logger = logger

            # Add new named timers to the dictionary of timers
            if name:
                self.timers.setdefault(name, 0)

        # Other methods are unchanged

        def stop(self):
            """Stop the timer, and report the elapsed time"""
            if self._start_time is None:
                raise TimerError("Timer is not running. Use .start() to start it")
            elapsed_time = time.perf_counter() - self._start_time
            self._start_time = None
            if self.logger:
                self.logger(self.text.format(elapsed_time))
            if self.name:
                self.timers[self.name] += elapsed_time
            return elapsed_time
Note that the .setdefault() method is used when adding a new Python timer to .timers. It only sets the value if name is not already defined in the dictionary. If name is already used in .timers, the value is left untouched, which makes it possible to accumulate several timers:
    >>> from timer import Timer
    >>> t = Timer("accumulate")
    >>> t.start()
    >>> t.stop()  # A few seconds later
    Elapsed time: 3.7036 seconds
    3.703554293999332
    >>> t.start()
    >>> t.stop()  # A few seconds later
    Elapsed time: 2.3449 seconds
    2.3448921170001995
    >>> Timer.timers
    {'accumulate': 6.0484464109995315}
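The role of .setdefault() in the accumulation can be seen on a plain dictionary, independent of Timer. The elapsed values below are made up for the example.

```python
timers = {}

# First call: the key is missing, so it is created with the default 0
timers.setdefault("download", 0)
timers["download"] += 1.5

# Second call: the key already exists, so the accumulated value is kept
timers.setdefault("download", 0)
timers["download"] += 2.25

print(timers)  # {'download': 3.75}
```

Had the second call been a plain assignment, timers["download"] = 0, the first measurement would have been lost.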
Now you can revisit download_data.py and make sure that only the time spent downloading the data is measured:
    # download_data.py
    import requests

    from timer import Timer


    def main():
        t = Timer("download", logger=None)
        source_url = 'https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/files/?p=%2Fall_six_datasets.zip&dl=1'
        headers = {'User-Agent': 'Mozilla/5.0'}
        for i in range(10):
            t.start()
            res = requests.get(source_url, headers=headers)
            t.stop()
            with open('dataset/datasets.zip', 'wb') as f:
                f.write(res.content)
        download_time = Timer.timers["download"]
        print(f"Downloaded 10 datasets in {download_time:0.2f} seconds")


    if __name__ == "__main__":
        main()
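The same idea extends to timing each step separately. The following sketch uses two named timers, "download" and "save", and replaces the network request and the disk write with time.sleep() calls so that it runs without network access. The Timer here is a trimmed-down copy without error checks or message output, and the helper names fake_download and fake_save are hypothetical.

```python
import time


class Timer:
    """Trimmed-down named timer that only accumulates elapsed time"""

    timers = {}

    def __init__(self, name):
        self.name = name
        self._start_time = None
        self.timers.setdefault(name, 0)

    def start(self):
        self._start_time = time.perf_counter()

    def stop(self):
        elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None
        self.timers[self.name] += elapsed_time
        return elapsed_time


def fake_download():
    time.sleep(0.02)  # stand-in for requests.get(...)
    return b"data"


def fake_save(content):
    time.sleep(0.005)  # stand-in for writing the file to disk


download_timer = Timer("download")
save_timer = Timer("save")

for _ in range(3):
    download_timer.start()
    content = fake_download()
    download_timer.stop()

    save_timer.start()
    fake_save(content)
    save_timer.stop()

# One accumulated total per named timer
for name, total in Timer.timers.items():
    print(f"{name}: {total:0.4f} seconds")
```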
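You now have a very neat version of Timer that is consistent, flexible, convenient, and informative! Many of the improvements made in this section can also be applied to other kinds of classes in your projects.
One last improvement to Timer makes it more informative when you use it interactively. The following instantiates a timer and looks at its representation:
    >>> from timer import Timer
    >>> t = Timer()
    >>> t
    <timer.Timer object at 0x7f0578804320>
The last line is the default way Python represents objects. The information in this result is not very clear, so we improve it next.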
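Here we introduce the dataclasses module, which has been part of the standard library since Python 3.7, so nothing needs to be installed on Python 3.7 and later. On Python 3.6, a backport is available from PyPI:
pip install dataclasses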
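The Python timer can be converted to a data class with the @dataclass decorator:
    # timer.py
    import time
    from dataclasses import dataclass, field
    from typing import Any, ClassVar

    # ...

    @dataclass
    class Timer:
        timers: ClassVar = {}
        name: Any = None
        text: Any = "Elapsed time: {:0.4f} seconds"
        logger: Any = print
        _start_time: Any = field(default=None, init=False, repr=False)

        def __post_init__(self):
            """Initialization: add timer to dict of timers"""
            if self.name:
                self.timers.setdefault(self.name, 0)

        # The rest of the code is unchanged
This code replaces the earlier .__init__() method. Note how data classes use a syntax similar to the class variable syntax seen earlier to define all variables. In fact, .__init__() is created automatically for data classes, based on the annotated variables in the class definition.
Variables must be annotated to be used in a data class. You can use these annotations to add type hints to your code. If you do not want to use type hints, annotate all variables with Any instead. We will soon see how to add actual type hints to the data class.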
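The effect of annotations can be checked with dataclasses.fields(), which lists the fields the decorator picked up. The Example class below is a hypothetical stand-in for Timer, kept small so that each kind of attribute appears once.

```python
from dataclasses import dataclass, field, fields
from typing import Any, ClassVar


@dataclass
class Example:
    registry: ClassVar = {}  # ClassVar: shared class variable, not a field
    name: Any = None         # annotated: becomes an __init__() parameter
    # A field that is excluded from __init__() and from the representation
    _state: Any = field(default=None, init=False, repr=False)
    unannotated = "ignored"  # no annotation: plain class attribute, not a field


field_names = [f.name for f in fields(Example)]
print(field_names)            # ['name', '_state']
print(Example(name="demo"))   # Example(name='demo')
```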
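Here are a few things to note about the Timer data class. The @dataclass decorator marks Timer as a data class. The ClassVar annotation on .timers tells the data class that it is a class variable and not a field, so it is not part of .__init__() or the representation. .name, .text, and .logger are regular fields with default values. ._start_time is declared with field() using init=False and repr=False, so it cannot be passed when creating a Timer and does not appear in the representation. Finally, .__post_init__() is called automatically after the generated .__init__() and is used here to register named timers in .timers.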
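The new Timer data class works just like the earlier regular class, except that it now has a nice, informative representation:
    >>> from timer import Timer
    >>> t = Timer()
    >>> t
    Timer(name=None, text='Elapsed time: {:0.4f} seconds', logger=<built-in function print>)
    >>> t.start()
    >>> t.stop()  # A few seconds later
    Elapsed time: 6.7197 seconds
    6.719705373998295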
Now we have a very neat version of Timer that is consistent, flexible, convenient, and informative! Many of the improvements made in this article can also be applied to other kinds of classes in your projects.
Here is the complete current source code of Timer. Note that type hints have been added to the code for extra documentation:
    # timer.py
    import time
    from dataclasses import dataclass, field
    from typing import Callable, ClassVar, Dict, Optional


    class TimerError(Exception):
        """A custom exception used to report errors in use of Timer class"""


    @dataclass
    class Timer:
        timers: ClassVar[Dict[str, float]] = {}
        name: Optional[str] = None
        text: str = "Elapsed time: {:0.4f} seconds"
        logger: Optional[Callable[[str], None]] = print
        _start_time: Optional[float] = field(default=None, init=False, repr=False)

        def __post_init__(self) -> None:
            """Add timer to dict of timers after initialization"""
            if self.name is not None:
                self.timers.setdefault(self.name, 0)

        def start(self) -> None:
            """Start a new timer"""
            if self._start_time is not None:
                raise TimerError("Timer is running. Use .stop() to stop it")
            self._start_time = time.perf_counter()

        def stop(self) -> float:
            """Stop the timer, and report the elapsed time"""
            if self._start_time is None:
                raise TimerError("Timer is not running. Use .start() to start it")

            # Calculate elapsed time
            elapsed_time = time.perf_counter() - self._start_time
            self._start_time = None

            # Report elapsed time
            if self.logger:
                self.logger(self.text.format(elapsed_time))
            if self.name:
                self.timers[self.name] += elapsed_time

            return elapsed_time
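To summarize, using a class to create a Python timer has several benefits: readability (the code reads naturally when class and method names are chosen carefully), consistency (the timer's state and reporting are encapsulated in one place), and flexibility (the text, the logger, and the name can all be customized).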
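This class is very flexible, and you can use it in almost any situation where you need to monitor how long code takes to run. In the following parts, however, the author (云朵君) will walk through how to use context managers and decorators, which make timing code blocks and functions even more convenient.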