How reliable are large models? The latest tutorial from IBM and other scholars on "Foundational Robustness of Foundation Models"

王林

Apr 11, 2023 pm 10:43 PM

As one of the most prestigious AI academic conferences in the world, NeurIPS is a major event in the academic community every year. Its full name is Neural Information Processing Systems, and it is usually hosted by the NeurIPS Foundation every December.

Topics discussed at the conference include deep learning, computer vision, large-scale machine learning, learning theory, optimization, sparsity theory, and many other subfields.

This year's NeurIPS is the 36th and will be held for two weeks from November 28th to December 9th.

The first week will be an in-person meeting at the Ernest N. Morial Convention Center in New Orleans, USA, and the second week will be an online meeting.

In this tutorial, scholars from IBM Research and other institutions discuss the robustness of large models, a topic well worth attention.

Foundation models use deep learning: they are pre-trained on large-scale unlabeled data and then fine-tuned with task-specific supervision. They have become a mainstream technique in machine learning.

Although foundation models hold a lot of promise for learning general representations and for few-shot/zero-shot generalization across domains and data modalities, the enormous data volumes and complex neural network architectures they rely on also pose unprecedented challenges and considerable risks in terms of robustness and privacy.

This tutorial aims to provide a Coursera-like online tutorial, containing comprehensive lectures, a hands-on and interactive Jupyter/Colab live coding demonstration, and a panel discussion on different aspects of trustworthiness in foundation models.

https://sites.google.com/view/neurips2022-frfm-turotial

Table of contents:

Basics in foundation models and robustness
Deep dive on foundation models for computer vision
Deep dive on foundation models for code
Hands-on code walkthrough
Concluding Remarks
Q&A
Panel discussion

Real-world machine learning systems need to be robust to distribution shifts: they should work well on test distributions that differ from the training distribution.

High-stakes applications such as poverty mapping in under-resourced countries [Xie et al. 2016; Jean et al. 2016], self-driving cars [Yu et al. 2020a; Sun et al. 2020a], and medical diagnosis [AlBadawy et al. 2018; Dai and Gool 2018] require models to generalize well to environments not seen in the training data. For example, test samples may come from different countries, different driving conditions, or different hospitals.

Previous work has shown that these distribution shifts can lead to large performance degradation even for current state-of-the-art models [Blitzer et al. 2006; Daumé III 2007; Sugiyama et al. 2007; Ganin and Lempitsky 2015; Peng et al. 2019; Kumar et al. 2020a; Arjovsky et al. 2019; Szegedy et al. 2014; Hendrycks and Dietterich 2019; Sagawa et al. 2020a; Recht et al. 2019; Abney 2007; Ruder and Plank 2018; Geirhos et al. 2018; Kumar et al. 2020b; Yu et al. 2020b; Geirhos et al. 2020; Xie et al. 2021a; Koh et al. 2021].

A foundation model is trained on a large and diverse unlabeled dataset sampled from a pre-training distribution, and can then be adapted to many downstream tasks.

For each downstream task, the foundation model is adapted to labeled training data sampled from an in-distribution (ID) training distribution, and then evaluated on an out-of-distribution (OOD) test distribution.

For example, a poverty map prediction model [Xie et al. 2016; Jean et al. 2016] can learn features that are useful for all countries from unlabeled satellite data around the world, then be fine-tuned on labeled examples from Nigeria, and finally be evaluated on Malawi, where labeled examples are lacking.
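The ID/OOD evaluation protocol described above can be illustrated with a toy sketch. This is not the tutorial's code: the data are synthetic 1D Gaussians, the "model" is a simple nearest-centroid threshold rather than a foundation model, and the names `sample`, `fit_threshold`, and `offset` are hypothetical. It only shows how a model that does well on held-out ID data can degrade when every test input is shifted, as might happen in a new country or hospital.

```python
import random


def sample(rng, n, offset=0.0):
    """Draw n (x, y) pairs; `offset` shifts every input to mimic a new environment."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        center = 1.0 if y == 1 else -1.0
        data.append((rng.gauss(center, 0.5) + offset, y))
    return data


def fit_threshold(data):
    """Nearest-centroid rule in 1D: threshold at the midpoint of the two class means."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0


def accuracy(threshold, data):
    """Fraction of points whose predicted label (x > threshold) matches y."""
    return sum(1 for x, y in data if int(x > threshold) == y) / len(data)


rng = random.Random(0)
train = sample(rng, 2000)               # labeled ID training data
id_test = sample(rng, 2000)             # held-out data from the same distribution
ood_test = sample(rng, 2000, offset=1.0)  # assumed shift: all inputs moved by +1

threshold = fit_threshold(train)
id_acc = accuracy(threshold, id_test)
ood_acc = accuracy(threshold, ood_test)
print(f"ID accuracy:  {id_acc:.3f}")
print(f"OOD accuracy: {ood_acc:.3f}")
```

Reporting both numbers side by side, rather than ID accuracy alone, is what makes the robustness gap visible.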

We believe that: 1) foundation models are a particularly promising approach to robustness. Existing work shows that pre-training on unlabeled data is an effective and general way to improve accuracy on OOD test distributions, in contrast to many robustness interventions that are restricted to narrow types of distribution shift.

However, we also discuss 2) why foundation models may not always mitigate distribution shifts, such as shifts caused by spurious correlations or by changes over time.

Finally, 3) we outline several research directions for exploiting and improving the robustness of foundation models.
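To make the spurious-correlation failure mode in point 2) concrete, here is a minimal sketch under stated assumptions. It is not taken from the tutorial: the data are synthetic, the model is a small logistic regression trained from scratch, and `agree_prob` is a hypothetical knob controlling how often a "spurious" feature agrees with the label. In training the spurious feature agrees 95% of the time; in the OOD test set the correlation is reversed.

```python
import math
import random


def sample(rng, n, agree_prob):
    """Return ([core, spurious], label) pairs.

    The core feature is a noisy but stable signal. The spurious feature matches
    the label's sign with probability `agree_prob` (hypothetical toy setup).
    """
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 1.0 if y == 1 else -1.0
        core = sign + rng.gauss(0.0, 1.0)
        agrees = rng.random() < agree_prob
        spurious = (sign if agrees else -sign) + rng.gauss(0.0, 0.1)
        data.append(([core, spurious], y))
    return data


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(data, steps=200, lr=0.5):
    """Full-batch gradient descent on the logistic loss (no bias; classes are symmetric)."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1]) - y
            grad[0] += err * x[0]
            grad[1] += err * x[1]
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w


def accuracy(w, data):
    """Fraction of points where the sign of the linear score matches the label."""
    hits = sum(1 for x, y in data if int(w[0] * x[0] + w[1] * x[1] > 0) == y)
    return hits / len(data)


rng = random.Random(0)
train = sample(rng, 2000, agree_prob=0.95)     # spurious feature mostly agrees
id_test = sample(rng, 2000, agree_prob=0.95)   # same correlation as training
ood_test = sample(rng, 2000, agree_prob=0.05)  # correlation reversed at test time

w = train_logreg(train)
id_acc = accuracy(w, id_test)
ood_acc = accuracy(w, ood_test)
print(f"weights (core, spurious): {w[0]:.2f}, {w[1]:.2f}")
print(f"ID accuracy:  {id_acc:.3f}")
print(f"OOD accuracy: {ood_acc:.3f}")
```

Because the spurious feature is less noisy than the core feature during training, the classifier learns to lean on it, and accuracy falls once that correlation no longer holds.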

We note that one way foundation models improve downstream performance is by providing the adapted model with inductive biases (through model initialization) that are learned on diverse datasets extending beyond the downstream training data.

However, the same inductive biases can also encode harmful associations from the pre-training data and lead to representational and allocational harms in the presence of distribution shifts.
