New SOTA for target detection: YOLOv9 comes out, and the new architecture brings traditional convolution back to life-AI-php.cn

Home

Technology peripherals

New SOTA for target detection: YOLOv9 comes out, and the new architecture brings traditional convolution back to life

PHPz

Feb 23, 2024 pm 12:49 PM

Target Detection industry data lost yolov9

In the field of target detection, YOLOv9 continues to make progress during the implementation process. By adopting new architecture and methods, it effectively improves the parameter utilization of traditional convolution, which makes its performance far superior to previous generation products.

More than a year after YOLOv8 was officially released in January 2023, YOLOv9 is finally here!

Since Joseph Redmon, Ali Farhadi and others proposed the first-generation YOLO model in 2015, researchers in the field of target detection have updated and iterated it many times. YOLO is a prediction system based on global information of images, and its model performance is continuously enhanced. By continuously improving algorithms and technologies, researchers have achieved remarkable results, making YOLO more and more powerful in target detection tasks. These continuous improvements and optimizations have brought new opportunities and challenges to the development of target detection technology, while also promoting progress and innovation in this field. The success of YOLO has also inspired researchers to continue their efforts.

This time, YOLOv9 was jointly developed by Academia Sinica, Taipei University of Technology, Taiwan, and other institutions. The related paper "Learning What You Want to Learn Using Programmable Gradient Information" 》has been released.

New SOTA for target detection: YOLOv9 comes out, and the new architecture brings traditional convolution back to life

Paper address: https://arxiv.org/pdf/2402.13616.pdf

GitHub address: https://github.com/WongKinYiu/ yolov9

Today’s deep learning methods focus on how to design the most appropriate objective function so that the model’s prediction results can be closest to the real situation. At the same time, an appropriate architecture must be designed that can help obtain sufficient information for prediction. However, existing methods ignore the fact that a large amount of information will be lost when the input data undergoes layer-by-layer feature extraction and spatial transformation.

Therefore, YOLOv9 deeply studies the important issues of data loss when data is transmitted through deep networks, namely information bottlenecks and reversible functions.

Researchers proposed the concept of programmable gradient information (PGI) to cope with the various changes required by deep networks to achieve multiple goals. PGI can provide complete input information for the target task to calculate the objective function, thereby obtaining reliable gradient information to update network weights.

In addition, researchers designed a new lightweight network architecture based on gradient path planning, namely Generalized Efficient Layer Aggregation Network (GELAN). This architecture confirms that PGI can achieve excellent results on lightweight models.

The researchers verified the proposed GELAN and PGI on the target detection task based on the MS COCO data set. Results show that GELAN achieves better parameter utilization using only traditional convolution operators compared to SOTA methods developed based on deep convolutions.

For PGI, it is very adaptable and can be used on various models from light to large. We can use this to obtain complete information, thereby enabling a model trained from scratch to achieve better results than a SOTA model pre-trained using a large dataset. Figure 1 below shows some comparison results.

For the newly released YOLOv9, Alexey Bochkovskiy, who has participated in the development of YOLOv7, YOLOv4, Scaled-YOLOv4 and DPT, spoke highly of it, saying that YOLOv9 is better than any convolution-based or transformer's object detector.

## Source: https://twitter.com/alexeyab84/status/1760685626247250342

and Netizens said that YOLOv9 looks like the new SOTA real-time target detector, and his own custom training tutorial is also on the way.

^{#More "hard-working" netizens have added pip support to the YOLOv9 model.}

## Source: https://twitter.com/kadirnar_ai/status/1760716187896283635

Let’s look at YOLOv9 next details.

Problem Statement
Usually, people attribute the convergence difficulty problem of deep
Neural Network to factors such as gradient disappearance or gradient saturation. These phenomena are indeed Exists in traditional deep neural networks. However, modern deep neural networks have fundamentally solved the above problems by designing various normalization and activation functions. But even so, there are still problems with slow convergence speed or poor convergence effect in deep neural networks. So what is the essence of this problem? Through in-depth analysis of the information bottleneck, the researchers deduced the root cause of the problem: soon after the gradient is initially passed out from the very deep network, much of the information needed to achieve the goal is lost. To verify this inference, the researchers performed feedforward processing on deep networks of different architectures with initial weights. Figure 2 illustrates this visually. Clearly, PlainNet loses a lot of important information required for object detection at deep layers. As for the proportion of important information that ResNet, CSPNet and GELAN can retain, it is indeed positively related to the accuracy that can be obtained after training. The researchers further designed a method based on reversible networks to solve the causes of the above problems.

Method Introduction

Programmable Gradient Information (PGI)

This study proposes a new auxiliary supervision framework : Programmable Gradient Information (PGI), as shown in Figure 3(d).
PGI mainly includes three parts, namely (1) main branch, (2) auxiliary reversible branch, (3) multi-level auxiliary information.

PGI’s inference process only uses the main branch, so there is no additional reasoning cost;

The auxiliary reversible branch is for processing neural networks Problems caused by deepening, network deepening will cause information bottlenecks, causing the loss function to be unable to generate reliable gradients;

Multi-level auxiliary information is designed to deal with the error accumulation problem caused by deep supervision, Especially architectures and lightweight models with multiple prediction branches.

GELAN Network

In addition, the study also proposes a new network architecture GELAN (as shown in the figure below). Specifically, The researchers combined the two neural network architectures of CSPNet and ELAN to design a generalized efficient layer aggregation network (GELAN) that takes into account lightweight, reasoning speed and accuracy. The researchers generalized the capabilities of ELAN, which initially only used stacks of convolutional layers, to a new architecture that can use any computational block.

Experimental results
To evaluate the performance of YOLOv9, the study first compared YOLOv9 with other real-time object detectors trained from scratch A comprehensive comparison was conducted and the results are shown in Table 1 below.

The study also included the ImageNet pre-trained model in the comparison, and the results are shown in Figure 5 below. It is worth noting that YOLOv9 using traditional convolution is even better than YOLO MS using deep convolution in parameter utilization.

Ablation Experiment
In order to explore the role of each component in YOLOv9, this study conducted a series of ablation experiments.
The study first conducted an ablation experiment on GELAN’s computing block. As shown in Table 2 below, the study found that by replacing the convolutional layers in ELAN with different computational blocks, the system maintained good performance.

The study then conducted ablation experiments on GELANs of different sizes for ELAN block depth and CSP block depth, and the results are shown in Table 3 below.
In terms of PGI, researchers conducted ablation studies on auxiliary reversible branches and multi-level auxiliary information on the backbone network and neck respectively. Table 4 lists the results of all experiments. As can be seen from Table 4, PFH is only effective for deep models, while the PGI proposed in this paper can improve accuracy under different combinations.

The researchers further implemented PGI and depth monitoring on models of different sizes and compared the results. The results are shown in Table 5.

Figure 6 shows the results of incrementally adding components from baseline YOLOv7 to YOLOv9-E.

Visualization

The researchers explored the information bottleneck problem and visualized it. Figure 6 shows the Visualization results of feature maps obtained under the architecture using random initial weights as feedforward.

Figure 7 illustrates whether PGI can provide more reliable gradients during training, so that the parameters used for updating effectively capture the relationship between the input data and the target.

For more technical details, please read the original article.

The above is the detailed content of New SOTA for target detection: YOLOv9 comes out, and the new architecture brings traditional convolution back to life. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Best Graphic Settings

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows: Seashell Riddle Solution

2 weeks ago By DDD

R.E.P.O. How to Fix Audio if You Can't Hear Anyone

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Chat Commands and How to Use Them

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7519

CakePHP Tutorial

1378

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

Related knowledge

How to use sql datetime Apr 09, 2025 pm 06:09 PM

The DATETIME data type is used to store high-precision date and time information, ranging from 0001-01-01 00:00:00 to 9999-12-31 23:59:59.99999999, and the syntax is DATETIME(precision), where precision specifies the accuracy after the decimal point (0-7), and the default is 3. It supports sorting, calculation, and time zone conversion functions, but needs to be aware of potential issues when converting precision, range and time zones.

How to add columns in PostgreSQL? Apr 09, 2025 pm 12:36 PM

PostgreSQL The method to add columns is to use the ALTER TABLE command and consider the following details: Data type: Select the type that is suitable for the new column to store data, such as INT or VARCHAR. Default: Specify the default value of the new column through the DEFAULT keyword, avoiding the value of NULL. Constraints: Add NOT NULL, UNIQUE, or CHECK constraints as needed. Concurrent operations: Use transactions or other concurrency control mechanisms to handle lock conflicts when adding columns.

Can I retrieve the database password in Navicat? Apr 08, 2025 pm 09:51 PM

Navicat itself does not store the database password, and can only retrieve the encrypted password. Solution: 1. Check the password manager; 2. Check Navicat's "Remember Password" function; 3. Reset the database password; 4. Contact the database administrator.

How to delete rows that meet certain criteria in SQL Apr 09, 2025 pm 12:24 PM

Use the DELETE statement to delete data from the database and specify the deletion criteria through the WHERE clause. Example syntax: DELETE FROM table_name WHERE condition; Note: Back up data before performing a DELETE operation, verify statements in the test environment, use the LIMIT clause to limit the number of deleted rows, carefully check the WHERE clause to avoid misdeletion, and use indexes to optimize the deletion efficiency of large tables.

How to recover data after SQL deletes rows Apr 09, 2025 pm 12:21 PM

Recovering deleted rows directly from the database is usually impossible unless there is a backup or transaction rollback mechanism. Key point: Transaction rollback: Execute ROLLBACK before the transaction is committed to recover data. Backup: Regular backup of the database can be used to quickly restore data. Database snapshot: You can create a read-only copy of the database and restore the data after the data is deleted accidentally. Use DELETE statement with caution: Check the conditions carefully to avoid accidentally deleting data. Use the WHERE clause: explicitly specify the data to be deleted. Use the test environment: Test before performing a DELETE operation.

Navicat's method to view PostgreSQL database password Apr 08, 2025 pm 09:57 PM

It is impossible to view PostgreSQL passwords directly from Navicat, because Navicat stores passwords encrypted for security reasons. To confirm the password, try to connect to the database; to modify the password, please use the graphical interface of psql or Navicat; for other purposes, you need to configure connection parameters in the code to avoid hard-coded passwords. To enhance security, it is recommended to use strong passwords, periodic modifications and enable multi-factor authentication.

How to clean all data with redis Apr 10, 2025 pm 05:06 PM

How to clean all Redis data: Redis 2.8 and later: The FLUSHALL command deletes all key-value pairs. Redis 2.6 and earlier: Use the DEL command to delete keys one by one or use the Redis client to delete methods. Alternative: Restart the Redis service (use with caution), or use the Redis client (such as flushall() or flushdb()).

How to create oracle database How to create oracle database Apr 11, 2025 pm 02:36 PM

To create an Oracle database, the common method is to use the dbca graphical tool. The steps are as follows: 1. Use the dbca tool to set the dbName to specify the database name; 2. Set sysPassword and systemPassword to strong passwords; 3. Set characterSet and nationalCharacterSet to AL32UTF8; 4. Set memorySize and tablespaceSize to adjust according to actual needs; 5. Specify the logFile path. Advanced methods are created manually using SQL commands, but are more complex and prone to errors. Pay attention to password strength, character set selection, tablespace size and memory

See all articles