How to use weak references in Python

PHPz

May 12, 2023 pm 11:52 PM


Background

Before discussing how weak references (weakref) work, let's first look at what a weak reference is and what problem it solves.

Suppose we have a multi-threaded program that processes application data concurrently:

# Holds a lot of resources; expensive to create and destroy
class Data:
    def __init__(self, key):
        pass


Each piece of application data, Data, is uniquely identified by a key, and the same data may be accessed by multiple threads at once. Because Data holds a lot of system resources, it is expensive to create and destroy. We want the program to keep only one copy of each Data object and never create it twice, even when several threads access it at the same time.

To this end, we try to design a caching middleware Cacher:

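import threading

# Data cache
class Cacher:
    def __init__(self):
        self.pool = {}                  # key -> Data (strong references)
        self.lock = threading.Lock()    # serializes access from multiple threads

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is not None:
                # Cache hit: reuse the existing object
                return data
            # Cache miss: create the object and remember it
            self.pool[key] = data = Data(key)
            return data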


Cacher internally uses a dict to cache the Data objects it has created, and provides a get method for obtaining them. When get is called, it first checks the cache dictionary. If the data already exists, it is returned directly; otherwise a new object is created and saved in the dictionary. Data therefore enters the cache when it is first created, and any thread that asks for it later gets the same cached copy.

This looks great! The fly in the ointment, however, is that Cacher risks leaking resources.

Once a Data object is created, it is stored in the cache dictionary and never released. The program's resource usage, such as memory, keeps growing and may eventually be exhausted. We would like a piece of data to be released automatically once no thread is using it any more.

We could maintain a reference count for each piece of data in Cacher: the get method increments the count, and a new remove method releases the data by decrementing the count and deleting the data from the cache dictionary when the count drops to zero.

A thread calls get to obtain the data and must call remove when it is finished. Cacher would in effect be reimplementing reference counting, which is far too much trouble. Python already has a built-in garbage collection mechanism, so why should the application implement its own?
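
To make the cost of that approach concrete, here is a minimal sketch of the manual reference-counting design described above. The class name CountingCacher and its counts dictionary are hypothetical names chosen for illustration, and Data is a simple stand-in for the expensive resource; this is not code from the original design.

```python
import threading


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


class CountingCacher:
    """Cache that tracks how many callers currently hold each Data object."""

    def __init__(self):
        self.pool = {}    # key -> Data
        self.counts = {}  # key -> number of get() calls not yet matched by remove()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is None:
                data = self.pool[key] = Data(key)
                self.counts[key] = 0
            self.counts[key] += 1
            return data

    def remove(self, key):
        with self.lock:
            self.counts[key] -= 1
            if self.counts[key] == 0:
                # The last user is done: drop the cached object.
                del self.pool[key]
                del self.counts[key]


cacher = CountingCacher()
a = cacher.get("k")
b = cacher.get("k")
cacher.remove("k")
still_cached = "k" in cacher.pool    # one user remains
cacher.remove("k")
released = "k" not in cacher.pool    # count reached zero
```

Every caller has to pair each get with exactly one remove; a single forgotten remove leaks the object, which is the burden the rest of the article sets out to avoid.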

The crux of the problem is Cacher's cache dictionary: as middleware, it does not use the data objects itself, so in principle it should not hold references to them. Is there some trick that lets us find the target object without creating a reference to it? After all, ordinary assignment always creates a reference.

Typical usage

This is where weak references (weakref) come in. A weak reference is a special object that is associated with a target object without increasing the target's reference count.

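# Create a piece of data
>>> d = Data('fasionchan.com')
>>> d
<__main__.Data object at 0x1018571f0>

# Create a weak reference to the data
>>> import weakref
>>> r = weakref.ref(d)

# Calling the weak reference returns the object it points to
>>> r()
<__main__.Data object at 0x1018571f0>
>>> r() is d
True

# Delete the temporary variable d; the Data object has no other references and is reclaimed
>>> del d
# Calling the weak reference again shows the Data object is gone (it returns None)
>>> r()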


So we only need to change Cacher's cache dictionary to store weak references, and the problem is solved:

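import threading
import weakref

# Data cache
class Cacher:
    def __init__(self):
        self.pool = {}                  # key -> weak reference to Data
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            r = self.pool.get(key)
            if r is not None:
                # Dereference; the result is None if the target was already reclaimed
                data = r()
                if data is not None:
                    return data
            # No live object for this key: create one and store only a weak reference
            data = Data(key)
            self.pool[key] = weakref.ref(data)
            return data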


Because the cache dictionary only stores weak references to Data objects, Cacher does not affect their reference counts. Once all threads have finished using a piece of data, its reference count drops to zero and it is released.
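
The following sketch exercises that behaviour. It repeats the weak-reference Cacher so that it runs on its own, with a simple stand-in for Data; the variable names (probe, target_gone, and so on) are illustrative only. The gc.collect() call is an assumption made for portability: on CPython the object is released as soon as its last strong reference disappears.

```python
import gc
import threading
import weakref


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


class Cacher:
    def __init__(self):
        self.pool = {}                  # key -> weak reference to Data
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            r = self.pool.get(key)
            if r is not None:
                data = r()
                if data is not None:
                    return data
            data = Data(key)
            self.pool[key] = weakref.ref(data)
            return data


cacher = Cacher()
first = cacher.get("k")
second = cacher.get("k")
same_object = first is second        # the cache hands out one shared object

probe = cacher.pool["k"]             # the weak reference stored in the cache
del first, second                    # drop every strong reference
gc.collect()                         # not required on CPython; helps other implementations

target_gone = probe() is None        # the Data object has been released
stale_entry_remains = "k" in cacher.pool
```

Note that the dead weak reference itself stays in the dictionary until the same key is requested again, so a small stale entry remains for each released object.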

Using dictionaries to cache data objects is very common, so the weakref module also provides two dictionary classes that hold only weak references:

weakref.WeakKeyDictionary, a mapping class whose keys are held by weak reference (once a key has no strong references left, its key-value entry disappears automatically);
weakref.WeakValueDictionary, a mapping class whose values are held by weak reference (once a value has no strong references left, its key-value entry disappears automatically);
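
As a small illustration of both classes, the sketch below stores one object as a weakly held value and as a weakly held key, then drops the only strong reference. Data is a hypothetical stand-in class, and gc.collect() is included only as a precaution for implementations without reference counting.

```python
import gc
import weakref


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


values = weakref.WeakValueDictionary()
keys = weakref.WeakKeyDictionary()

d = Data("a")
values["a"] = d          # the value is held weakly
keys[d] = "metadata"     # the key is held weakly

before = (len(values), len(keys))

del d                    # remove the only strong reference
gc.collect()

after = (len(values), len(keys))   # both entries have disappeared
```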

Our data cache can therefore be implemented with weakref.WeakValueDictionary, whose interface is essentially the same as that of a regular dictionary. We no longer need to manage weak reference objects ourselves, and the code becomes simpler and clearer:

import threading
import weakref

# Data cache
class Cacher:
    def __init__(self):
        # Values are held weakly; entries vanish once the value is released
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is not None:
                return data
            self.pool[key] = data = Data(key)
            return data


The weakref module also provides many other useful utility classes and functions. See the official documentation for details; they are not covered here.
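
To tie the pieces together, here is a hedged usage sketch of the WeakValueDictionary-based Cacher under concurrent access. The worker function, the results list, and the thread count are all assumptions made for the demonstration; the Cacher is repeated so that the example runs on its own.

```python
import gc
import threading
import weakref


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is not None:
                return data
            self.pool[key] = data = Data(key)
            return data


cacher = Cacher()
results = []
results_lock = threading.Lock()


def worker():
    # Each thread asks for the same key and keeps the object it receives.
    data = cacher.get("shared")
    with results_lock:
        results.append(data)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_same = all(obj is results[0] for obj in results)   # one shared copy
cached_while_in_use = len(cacher.pool)

results.clear()          # every user lets go of the data
gc.collect()             # precaution for non-refcounting implementations
cached_after_release = len(cacher.pool)
```

While any thread still holds the object, the cache keeps returning it; once the last holder lets go, the entry disappears without any explicit remove call.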

Working Principle

So what exactly is a weak reference, and why does it have this seemingly magical ability? Let's lift the veil and see how it really works.

>>> d = Data('fasionchan.com')

# weakref.ref is a built-in type object
>>> from weakref import ref
>>> ref
<class 'weakref'>

# Calling the weakref.ref type object creates a weak reference instance
>>> r = ref(d)
>>> r
<weakref at 0x1008d5b80; to 'Data' at 0x100873d60>


After the previous chapters, we are already familiar with reading the source code of built-in objects. The weak reference object is described by the following struct in the CPython source (its implementation lives in Objects/weakrefobject.c):

typedef struct _PyWeakReference PyWeakReference;
 
/* PyWeakReference is the base struct for the Python ReferenceType, ProxyType,
 * and CallableProxyType.
 */
#ifndef Py_LIMITED_API
struct _PyWeakReference {
    PyObject_HEAD
 
    /* The object to which this is a weak reference, or Py_None if none.
     * Note that this is a stealth reference:  wr_object's refcount is
     * not incremented to reflect this pointer.
     */
    PyObject *wr_object;
 
    /* A callable to invoke when wr_object dies, or NULL if none. */
    PyObject *wr_callback;
 
    /* A cache for wr_object's hash code.  As usual for hashes, this is -1
     * if the hash code isn't known yet.
     */
    Py_hash_t hash;
 
    /* If wr_object is weakly referenced, wr_object has a doubly-linked NULL-
     * terminated list of weak references to it.  These are the list pointers.
     * If wr_object goes away, wr_object is set to Py_None, and these pointers
     * have no meaning then.
     */
    PyWeakReference *wr_prev;
    PyWeakReference *wr_next;
};
#endif


As we can see, the PyWeakReference struct is the body of a weak reference object. It is a fixed-length object; in addition to the fixed header, it has 5 fields:

wr_object, an object pointer to the referenced object; the weak reference uses this field to find the referenced object without creating a reference to it;
wr_callback, a pointer to a callable object that is called when the referenced object is destroyed;
hash, which caches the hash value of the referenced object;
wr_prev and wr_next, the backward and forward pointers used to organize weak reference objects into a doubly linked list;

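From the comments in the code, we know that:

[Figure: a Data instance associated with two weak reference objects; dashed arrows show the wr_object pointers, solid arrows show the linked-list pointers]

A weak reference object is associated with the referenced object through its wr_object field, as shown by the dashed arrows in the figure;
An object can be associated with multiple weak reference objects at the same time; in the figure, the Data instance is associated with two weak reference objects;
All weak references associated with the same object are organized into a doubly linked list whose head is stored in the referenced object, as shown by the solid arrows in the figure;
When an object is destroyed, Python traverses its weak reference list and processes each entry in turn:

Set the wr_object field to None, so that calling the weak reference afterwards returns None and the caller knows the object has been destroyed;
Execute the callback function wr_callback (if any);

Thus the working principle of weak references is essentially the Observer pattern from design patterns: when an object is destroyed, all of its weak reference objects are notified and handled properly.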
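
The observer behaviour can be seen from Python without reading any C. In this minimal sketch, two weak references with callbacks are attached to one object; the events list and the lambda callbacks are illustrative names, and Data is a hypothetical stand-in class. The order in which callbacks run is deliberately not relied upon.

```python
import gc
import weakref


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


events = []

d = Data("k")
# Each callback receives the (now dead) weak reference object as its argument.
r1 = weakref.ref(d, lambda ref: events.append("r1 notified"))
r2 = weakref.ref(d, lambda ref: events.append("r2 notified"))

count_before = weakref.getweakrefcount(d)   # number of weak references to d

del d            # destroy the observed object
gc.collect()     # precaution for non-refcounting implementations

both_cleared = r1() is None and r2() is None
```

After the object is destroyed, both weak references return None and both callbacks have been recorded in events.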

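Implementation details

Understanding the basic principle of weak references is enough to use them well. If you are interested in the source code, you can dig further into some of the implementation details.

As mentioned earlier, all weak references to the same object are organized into a doubly linked list whose head is stored in the object. Since objects of many different types can be weakly referenced, it is hard to describe them with a single fixed struct. Python therefore provides a field in the type object, tp_weaklistoffset, which records the offset of the weak reference list head pointer within the instance object.

Thus, for any object o, we only need to follow its ob_type field to find its type object t, and then use t's tp_weaklistoffset field to locate the head of o's weak reference list.

Python provides two macro definitions in the Include/objimpl.h header file: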

/* Test if a type supports weak references */
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
 
#define PyObject_GET_WEAKREFS_LISTPTR(o) \
    ((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))


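PyType_SUPPORTS_WEAKREFS tests whether a type object supports weak references; weak references are supported only when tp_weaklistoffset is greater than zero. Built-in objects such as list do not support weak references;
PyObject_GET_WEAKREFS_LISTPTR fetches the head of an object's weak reference list: it first finds the type object t via the Py_TYPE macro, then reads the offset from the tp_weaklistoffset field, and finally adds it to the object's address to obtain the address of the list head field;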
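
The effect of this check is visible from Python. The sketch below probes a few kinds of objects; the helper supports_weakref and the example classes are hypothetical names introduced for illustration. It also reads the CPython-specific type attribute __weakrefoffset__, which exposes tp_weaklistoffset. Newer CPython releases may store this offset differently from the version discussed here, so the sketch only distinguishes zero from nonzero.

```python
import weakref


class Plain:
    pass


class Slotted:
    __slots__ = ("value",)                  # no __weakref__ slot: no list head


class SlottedWeak:
    __slots__ = ("value", "__weakref__")    # explicitly reserves the list head


class MyList(list):
    pass                                    # subclasses of list gain weakref support


def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


support = {
    "list": supports_weakref([]),
    "Plain": supports_weakref(Plain()),
    "Slotted": supports_weakref(Slotted()),
    "SlottedWeak": supports_weakref(SlottedWeak()),
    "MyList": supports_weakref(MyList()),
}

# CPython-specific: zero means the type has no weak reference list head.
list_offset = list.__weakrefoffset__
plain_offset = Plain.__weakrefoffset__
```

A class that defines __slots__ without "__weakref__" behaves like list here, which is a common surprise when such objects are placed in a WeakValueDictionary.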

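When we create a weak reference, we call the weak reference type object weakref and pass the referenced object d as an argument. The weak reference type object weakref is the type of all weak reference instance objects. It is a globally unique type object defined in Objects/weakrefobject.c, namely _PyWeakref_RefType (line 350).

According to what we learned about the object model, when Python calls an object it executes the tp_call function of that object's type. Calling the weak reference type object weakref therefore executes the tp_call function of weakref's own type, that is, of type. That tp_call function in turn calls weakref's tp_new and tp_init functions: tp_new allocates memory for the instance object, and tp_init initializes it.

Back in the Objects/weakrefobject.c source file, we can see that the tp_new field of _PyWeakref_RefType is initialized to weakref___new__ (line 276). The main logic of this function is as follows:

1. Parse the arguments to get the referenced object (line 282);
2. Call the PyType_SUPPORTS_WEAKREFS macro to check whether the referenced object supports weak references, and raise an exception if it does not (line 286);
3. Call the GET_WEAKREFS_LISTPTR macro to fetch the object's weak reference list head field; to make insertion easier, it returns a pointer to a pointer (line 294);
4. Call get_basic_refs to fetch the first basic weak reference object in the list, that is, one whose callback is empty (if any, line 295);
5. If the callback is empty and the object already has a basic weak reference with an empty callback, reuse that instance and return it directly (line 296);
6. If it cannot be reused, call the tp_alloc function to allocate memory, initialize the fields, and insert the new weak reference into the object's weak reference list (line 309):

If the callback is empty, insert it directly at the front of the list to make later reuse easier (see step 4);
If the callback is not empty, insert it after the basic weak reference object (if any), so that the basic weak reference stays at the head of the list for easy access;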
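
The reuse and ordering rules above can be observed from Python on CPython. This is a minimal sketch; the names on_death, basic_1, with_cb_1 and so on are illustrative, and Data is a hypothetical stand-in class. The behaviour shown is a CPython implementation detail rather than a language guarantee.

```python
import weakref


class Data:
    """Stand-in for an expensive resource (hypothetical placeholder)."""

    def __init__(self, key):
        self.key = key


def on_death(ref):
    """Placeholder callback; it does nothing in this sketch."""


d = Data("k")

with_cb_1 = weakref.ref(d, on_death)    # has a callback: always a new object
basic_1 = weakref.ref(d)                # no callback: the basic weak reference
basic_2 = weakref.ref(d)                # no callback again: reuses basic_1
with_cb_2 = weakref.ref(d, on_death)    # has a callback: another new object

basic_reused = basic_1 is basic_2
callbacks_distinct = with_cb_1 is not with_cb_2
total = weakref.getweakrefcount(d)      # one basic reference plus two with callbacks

# The basic weak reference sits at the head of the object's list.
head_is_basic = weakref.getweakrefs(d)[0] is basic_1
```

Because the basic reference is shared, creating many callback-free weak references to one object costs only a single weak reference object.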

When an object is reclaimed, its tp_dealloc function calls PyObject_ClearWeakRefs to clean up its weak references. This function takes the object's weak reference list, traverses it entry by entry, clears each wr_object field, and executes the wr_callback callback (if any). The details are not covered here; if you are interested, see the source code in Objects/weakrefobject.c, at line 880.

That concludes this section on weak references. A weak reference lets us reach the target object without increasing its reference count, and is often used in frameworks and middleware. Weak references look magical, but the underlying design is simply the Observer pattern: once created, a weak reference object is inserted into a linked list maintained by the target object, and thereby observes (subscribes to) the object's destruction event.
