


2022 Top 10 self-supervised learning models released! Eight entries from the United States and China dominate the list
Self-supervised learning enables computers to observe the world and understand it by learning the structure of images, speech, or text. This has driven many of the recent major advances in artificial intelligence.
Despite the considerable effort that researchers around the world have invested in this area, self-supervised learning algorithms still differ widely in how they learn from images, speech, text, and other modalities. Against this background, Analytics India Magazine has compiled a list of the top ten self-supervised learning models of 2022 for its readers.
Data2vec
Paper link: https://arxiv.org/pdf/2202.03555.pdf
Open source code: https://t.co/3x8VCwGI2x
Meta AI released the data2vec algorithm in January as a single self-supervised method for speech, images, and text. According to the AI team, the model is also highly competitive on NLP tasks.
It uses neither contrastive learning nor reconstruction of the input example. The Meta AI team stated that data2vec is trained to predict the model's own representations of the full input when given only a partial view of that input.
The team said: "We first encode the masked training samples in the student model. After that, in the same model, we encode the unmasked input samples to build the training target. This model (teacher model) and the student model only differ in parameters."
The student therefore predicts, from the masked training sample, the representation that the teacher produces for the unmasked sample. Because the target is a model representation rather than pixels, words, or audio units, the learning task no longer depends on modality-specific objectives.
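The student and teacher roles described above can be sketched in a few lines of plain Python. This is a minimal illustration, not Meta AI's implementation: the function names `ema_update` and `masked_regression_loss` are hypothetical, the teacher is assumed to track the student through an exponential moving average of its weights (as described in the data2vec paper), and a plain squared error stands in for whatever regression loss a real training setup would use.

```python
def ema_update(teacher, student, tau=0.999):
    """Move each teacher weight a small step toward the matching student weight.

    tau close to 1.0 means the teacher changes slowly (assumed default).
    """
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


def masked_regression_loss(student_repr, teacher_repr, mask):
    """Mean squared error counted only at masked positions.

    student_repr: per-position vectors computed from the masked input
    teacher_repr: per-position vectors computed from the unmasked input
    mask:         True where the student's input was masked
    """
    total, count = 0.0, 0
    for s_vec, t_vec, is_masked in zip(student_repr, teacher_repr, mask):
        if not is_masked:
            continue  # visible positions contribute nothing to the loss
        total += sum((s - t) ** 2 for s, t in zip(s_vec, t_vec))
        count += len(s_vec)
    return total / count if count else 0.0
```

Nothing in either function refers to pixels, tokens, or waveforms, which is the point of the approach: only the encoder that produces the vectors needs to know the modality.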
ConvNeXt
Paper link: https://arxiv.org/pdf/2201.03545.pdf
Open source code: https://t.co/nWx2KFtl7X
ConvNeXt, described by its authors as a ConvNet for the 2020s, is a model released by the Meta AI team in March. It is built entirely from standard ConvNet modules, which keeps it accurate, simple in design, and scalable.
VICReg
Paper link: https://t.co/H7crDPHCHV
Open source code: https://t.co/oadSBT61P3
Variance-Invariance-Covariance Regularization (VICReg) combines a variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, so that the encoder does not collapse into producing constant or uninformative vectors.
VICReg does not require techniques such as weight sharing between branches, batch normalization, feature normalization, output quantization, stop-gradient operations, or memory banks, and its results on several downstream tasks are comparable to the state of the art. Furthermore, experiments show that the variance regularization term can stabilize the training of other methods and improve their performance.
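To make the three terms concrete, here is a small pure-Python sketch of a VICReg-style loss over two batches of embeddings, one per branch. It follows the formulas as given in the VICReg paper, but the function names are hypothetical and the default weights (25, 25, 1) and target standard deviation `gamma=1.0` are assumptions taken from commonly cited settings; reference implementations may normalize the invariance term differently.

```python
import math


def _center(z):
    """Subtract the per-dimension batch mean from every embedding."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in z]


def invariance_term(z_a, z_b):
    """Mean squared Euclidean distance between paired embeddings."""
    return sum(
        (a - b) ** 2 for ra, rb in zip(z_a, z_b) for a, b in zip(ra, rb)
    ) / len(z_a)


def variance_term(z, gamma=1.0, eps=1e-4):
    """Hinge loss that pushes each dimension's std above gamma (anti-collapse)."""
    n, d = len(z), len(z[0])
    zc = _center(z)
    total = 0.0
    for j in range(d):
        var = sum(row[j] ** 2 for row in zc) / (n - 1)
        total += max(0.0, gamma - math.sqrt(var + eps))
    return total / d


def covariance_term(z):
    """Sum of squared off-diagonal covariances, scaled by the dimension."""
    n, d = len(z), len(z[0])
    zc = _center(z)
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum(row[i] * row[j] for row in zc) / (n - 1)
                total += cov ** 2
    return total / d


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0):
    """Weighted sum of the invariance, variance, and covariance terms."""
    return (
        sim_w * invariance_term(z_a, z_b)
        + var_w * (variance_term(z_a) + variance_term(z_b))
        + cov_w * (covariance_term(z_a) + covariance_term(z_b))
    )
```

A collapsed batch in which every embedding is identical has zero invariance and covariance cost but a large variance penalty, which is how the loss rules out the constant-vector solution without needing negative pairs.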
STEGO
Paper link: https://arxiv.org/abs/2203.08414
MIT's Computer Science and Artificial Intelligence Laboratory collaborated with Microsoft and Cornell University to develop the Self-supervised Transformer with Energy-based Graph Optimization (STEGO), which tackles one of the most difficult tasks in computer vision: assigning a label to every pixel of an image without human supervision.
STEGO learns "semantic segmentation", which simply means assigning a label to each pixel in the image.
Semantic segmentation is an important skill for today's computer vision systems because images can be cluttered with objects. To make matters more difficult, these objects do not always fit neatly into boxes. Algorithms tend to work better on discrete "things" such as people and cars than on hard-to-delimit "stuff" such as vegetation, the sky, and mashed potatoes.
Take a scene of a dog playing in the park as an example. Previous systems might only be able to identify the dog, but by assigning a label to each pixel of the image, STEGO can decompose the image into its main components: the dog, the sky, the grass, and the dog's owner.
Machines that can "see the world" are crucial to a variety of emerging technologies, such as self-driving cars and predictive models for medical diagnosis. Since STEGO can learn without labels, it can detect objects in different domains, even objects that humans do not yet fully understand.
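The idea of giving every pixel a label can be illustrated with a toy example. The sketch below is not STEGO's training procedure; it only shows the final step of turning per-pixel feature vectors into a label map by picking the most similar cluster centre. The names `cosine` and `label_pixels`, and the assumption that features and cluster centres are already available, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_pixels(features, centroids):
    """Assign each pixel the index of its most similar cluster centre.

    features:  H x W grid, one feature vector per pixel
    centroids: one vector per segment (e.g. dog, sky, grass, owner)
    """
    return [
        [
            max(range(len(centroids)), key=lambda k: cosine(pixel, centroids[k]))
            for pixel in row
        ]
        for row in features
    ]
```

For a 2 x 2 image whose top row resembles the first centre and whose bottom row resembles the second, `label_pixels` returns one label per pixel rather than one box per object, which is what distinguishes segmentation from detection.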
CoBERT
Paper link: https://arxiv.org/pdf/2210.04062.pdf
For self-supervised speech representation learning, researchers from the Chinese University of Hong Kong (Shenzhen) proposed Code BERT (CoBERT). Unlike other self-distillation methods, their model predicts representations from different modalities. The model converts speech into a sequence of discrete codes for representation learning.
First, the research team trained a code model in discrete space, using codes produced by a pre-trained HuBERT model. They then distilled the code model into a speech model, with the aim of learning better representations across modalities. The significant improvement on the ST task suggests that CoBERT's representations may carry more linguistic information than those of previous work.
CoBERT outperforms the best current algorithms on ASR tasks and brings significant improvements on the SUPERB Speech Translation (ST) task.
FedX
ColloSSL
Paper link: https://arxiv.org/pdf/2202.00758.pdf
Researchers at Nokia Bell Labs, in collaboration with Georgia Institute of Technology and the University of Cambridge, have developed ColloSSL, a collaborative self-supervised algorithm for human activity recognition.
Unlabeled sensor datasets captured simultaneously by multiple devices can be viewed as natural transformations of each other, and these transformations provide the signal for representation learning. The paper introduces three techniques: device selection, contrastive sampling, and a multi-view contrastive loss.
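A contrastive loss over several devices can be sketched as follows. This is a simplified InfoNCE-style formulation and not necessarily the exact loss used in the ColloSSL paper. It assumes that positives are embeddings from other devices recorded at the same moment as the anchor and that negatives come from other moments; the function names and the `temperature` default are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def multiview_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Pull the anchor toward all positives and away from all negatives.

    anchor:    embedding from the anchor device
    positives: embeddings from other devices, time-aligned with the anchor
    negatives: embeddings that should not match the anchor
    """
    pos = [math.exp(cosine(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine(anchor, q) / temperature) for q in negatives]
    return -math.log(sum(pos) / (sum(pos) + sum(neg)))
```

The loss is small when the anchor is close to the time-aligned views and far from the rest, and grows when those roles are reversed. Which devices supply the positives and negatives is the part that device selection and contrastive sampling would decide.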
LoRot
Paper link: https://arxiv.org/pdf/2207.10023.pdf
A research team at Sungkyunkwan University proposes a simple self-supervised auxiliary task, predicting localizable rotation (LoRot), that is designed around three properties so that it can assist a supervised objective.
The three properties are as follows. First, the task should guide the model to learn rich features. Second, the transformation used for self-supervision should not significantly shift the training distribution. Third, the task should be lightweight and generic so that it is easy to apply to existing techniques.
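A localized rotation can be sketched with nested lists standing in for an image. The snippet rotates one square patch by a multiple of 90 degrees and returns the rotation index as the auxiliary label; the rest of the image is untouched, which is why the training distribution moves very little. The helper names and the way the patch is chosen are hypothetical and do not reproduce the paper's exact variants.

```python
def rotate90(patch, k):
    """Rotate a square patch clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        patch = [list(row) for row in zip(*patch[::-1])]
    return patch


def local_rotation(image, top, left, size, k):
    """Return a copy of image with one size x size patch rotated, plus its label.

    image:     list of rows (a 2D grid of pixel values)
    top, left: upper-left corner of the patch
    k:         number of clockwise quarter turns; k % 4 is the auxiliary label
    """
    out = [list(row) for row in image]  # copy so the input stays unchanged
    patch = [row[left:left + size] for row in image[top:top + size]]
    rotated = rotate90(patch, k)
    for i in range(size):
        out[top + i][left:left + size] = rotated[i]
    return out, k % 4
```

During training, a classifier head would predict the returned label alongside the usual supervised target, so the auxiliary task adds only a small extra output layer.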
TS2Vec
