Table of Contents
1. Large model time series forecasting methods
2. Applying NLP large models to time series
3. Time series large models
4. Summary

An article on time series forecasting under the wave of large-scale models

Nov 06, 2023 08:13 AM

Today we will look at the application of large models to time series forecasting. As large models have matured in NLP, a growing body of work has tried to bring them to time series forecasting. This article introduces the main ways of applying large models to time series forecasting and summarizes some recent related work, to help readers understand how time series forecasting is being studied in the era of large models.

1. Large model time series forecasting methods

A large amount of work on large model time series forecasting has appeared in the past three months, and it can be broadly divided into two types.

The first is to use NLP large models directly for time series forecasting. In these methods, large language models such as GPT or LLaMA are applied to forecasting as-is; the key question is how to convert time series data into inputs these models can consume.

The second is to train a large model natively in the time series domain. In these methods, a large collection of time series datasets is used to jointly train a GPT- or LLaMA-style model on time series, which is then applied to downstream time series tasks.

For each of these two approaches, some representative works are introduced below.

2. Applying NLP large models to time series

These methods were among the earliest large model time series forecasting works.

In the paper "Large Language Models Are Zero-Shot Time Series Forecasters", co-published by New York University and Carnegie Mellon University, the numeric values of a time series are tokenized so they can be fed into large models such as GPT and LLaMA. Because different large models tokenize numbers differently, the encoding must be adapted to each model. For example, GPT's tokenizer may split a string of digits into arbitrary subsequences, which hurts the model's ability to learn, so the paper inserts a space between digits to force a per-digit tokenization. More recently released models such as LLaMA already tokenize digits individually, so no extra spaces are needed. In addition, to prevent overly large values from producing overly long input sequences, the paper rescales the raw time series so that its values fall into a more reasonable range.
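To make this concrete, below is a minimal Python sketch of this style of preprocessing. The helper names and the percentile-based scaling rule are illustrative assumptions, not the paper's actual code; the point is to rescale the series, render values at fixed precision, and insert spaces so that a GPT-style BPE tokenizer sees exactly one digit per token.

```python
# Illustrative sketch of LLM-friendly time series encoding (assumed helpers,
# not the paper's code): rescale, fixed-point render, space-separated digits.

def rescale(series, alpha=0.99):
    """Divide by the alpha-percentile of absolute values so magnitudes stay small."""
    scale = sorted(abs(x) for x in series)[int(alpha * (len(series) - 1))]
    return [x / scale for x in series], scale

def to_prompt(series, precision=2, space_digits=True):
    """Render each value at fixed precision; optionally space the digits apart
    so a GPT-style tokenizer cannot merge them into arbitrary chunks."""
    rendered = []
    for x in series:
        digits = str(round(x * 10**precision))    # fixed-point: 0.95 -> "95"
        rendered.append(" ".join(digits) if space_digits else digits)
    return " , ".join(rendered)

scaled, scale = rescale([0.61, 0.63, 0.68, 0.71])
print(to_prompt(scaled))                      # "9 0 , 9 3 , 1 0 0 , 1 0 4"
print(to_prompt(scaled, space_digits=False))  # "90 , 93 , 100 , 104"
```

For a LLaMA-style tokenizer that already splits numbers into single digits, `space_digits=False` gives the more compact encoding the paragraph above describes.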


The digit string produced above is fed into the large model, which autoregressively predicts the following digits; the predicted digits are finally converted back into time series values. The language model's conditional distribution over digits, predicting the probability of each possible next digit given the preceding ones, acts like an iterative hierarchical softmax, and combined with the representational power of the large model it can adapt to many types of distributions. This is why large models can be used for time series forecasting in this way. Moreover, the predicted probabilities over the next digit can be converted into an uncertainty estimate for the forecast.
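The decoding side can be sketched the same way, under the same assumptions as the encoding sketch above. Here `llm_sample` is a hypothetical stand-in for any autoregressive sampling call; drawing several completions yields both a point forecast (the per-step median) and an uncertainty band (per-step quantiles), matching the uncertainty-estimation idea just described.

```python
# Hedged decoding sketch: invert the digit encoding, then turn repeated LM
# samples into a median forecast plus an approximate 90% interval.

import statistics

def parse_completion(text, scale, precision=2):
    """Invert to_prompt: '9 0 , 9 3' -> [0.90 * scale, 0.93 * scale]."""
    values = []
    for chunk in text.split(","):
        digits = chunk.replace(" ", "").strip()
        if digits.lstrip("-").isdigit():
            values.append(int(digits) / 10**precision * scale)
    return values

def forecast(llm_sample, prompt, scale, horizon=4, n_samples=20):
    """llm_sample(prompt) -> completion string (assumed stand-in for the LM call)."""
    draws = []
    for _ in range(n_samples):
        values = parse_completion(llm_sample(prompt), scale)
        if len(values) >= horizon:
            draws.append(values[:horizon])
    point = [statistics.median(step) for step in zip(*draws)]  # per-step median
    lo = [sorted(step)[int(0.05 * (len(step) - 1))] for step in zip(*draws)]
    hi = [sorted(step)[int(0.95 * (len(step) - 1))] for step in zip(*draws)]
    return point, (lo, hi)                                     # ~90% interval
```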


In another paper, "TIME-LLM: TIME SERIES FORECASTING BY REPROGRAMMING LARGE LANGUAGE MODELS", the authors propose a reprogramming method that converts time series into a text-like representation, aligning the time series and text modalities.

Concretely, the time series is first split into multiple patches, and each patch is embedded through an MLP. The patch embeddings are then mapped onto the language model's word vectors, achieving cross-modal alignment between time series segments and text. The paper proposes a text-prototype idea: multiple words are mapped to a prototype that represents the semantics of a stretch of patches. In the paper's example, the words "short" and "up" map to a prototype (drawn as red triangles) that corresponds to patches covering short-term rising subsequences. A sketch of this mechanism follows.
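Below is a minimal PyTorch sketch of the reprogramming idea. All dimensions, names, and the use of `nn.MultiheadAttention` are illustrative assumptions rather than Time-LLM's actual implementation; it only shows the shape of the mechanism: patch embeddings act as queries that cross-attend to a small set of text prototypes distilled from the frozen LLM's word-embedding table.

```python
# Illustrative reprogramming layer (assumed shapes, not Time-LLM's code):
# time series patches cross-attend to text prototypes from the LLM vocabulary.

import torch
import torch.nn as nn

class Reprogramming(nn.Module):
    def __init__(self, vocab_emb, patch_len=16, d_model=768, n_prototypes=100):
        super().__init__()
        self.register_buffer("vocab_emb", vocab_emb)      # frozen (V, d) word embeddings
        self.patch_embed = nn.Linear(patch_len, d_model)  # per-patch embedding
        # learn a few text prototypes as mixtures over the full vocabulary
        self.proto_proj = nn.Linear(vocab_emb.size(0), n_prototypes, bias=False)
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)

    def forward(self, patches):                  # patches: (B, n_patches, patch_len)
        q = self.patch_embed(patches)            # queries come from the time series
        protos = self.proto_proj(self.vocab_emb.T).T      # (n_prototypes, d_model)
        kv = protos.unsqueeze(0).expand(patches.size(0), -1, -1)
        out, _ = self.attn(q, kv, kv)            # align patches with text prototypes
        return out                               # fed to the frozen LLM as input embeddings

vocab_emb = torch.randn(32000, 768)              # stand-in for a LLaMA-sized embedding table
layer = Reprogramming(vocab_emb)
x = torch.randn(2, 24, 16)                       # 2 series, 24 patches of length 16
print(layer(x).shape)                            # torch.Size([2, 24, 768])
```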


3. Time series large models

The other research direction borrows the large-model recipe from natural language processing and builds a large model directly for time series forecasting.

"Lag-Llama: Towards Foundation Models for Time Series Forecasting" builds a Llama-style model for time series. Its core contributions lie at the feature level and the model-structure level.

On the feature side, the paper extracts multi-scale, multi-type lag features, mainly statistics of the historical series computed over different time windows of the raw series; these are fed into the model as additional inputs. On the model side, the backbone is the Transformer at the core of LLaMA in NLP, with the normalization and positional-encoding components adapted. The final output layer uses multiple heads to fit the parameters of a probability distribution (for a Gaussian, the mean and variance). This paper uses a Student-t distribution and outputs its three parameters, degrees of freedom, mean, and scale, yielding a predictive probability distribution for each time point.
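The following sketch illustrates the two pieces just described: simple lag features and a Student-t output head. The specific lag set, layer sizes, and parameterization are assumptions for illustration, not Lag-Llama's actual code.

```python
# Illustrative lag features + Student-t head (assumed details, not Lag-Llama's code).

import torch
import torch.nn as nn
import torch.nn.functional as F

def lag_features(series, lags=(1, 7, 14, 28)):  # assumed daily data: day/week/month lags
    """series: (B, T) -> (B, T, len(lags)) of lagged values, zero-padded at the start."""
    feats = [F.pad(series, (lag, 0))[:, :series.size(1)] for lag in lags]
    return torch.stack(feats, dim=-1)

class StudentTHead(nn.Module):
    """Map the transformer's hidden state to the three Student-t parameters."""
    def __init__(self, d_model=256):
        super().__init__()
        self.proj = nn.Linear(d_model, 3)        # raw (df, loc, scale)

    def forward(self, h):                        # h: (B, T, d_model)
        raw = self.proj(h)
        df = 2.0 + F.softplus(raw[..., 0])       # keep degrees of freedom > 2
        loc = raw[..., 1]                        # mean / location
        scale = F.softplus(raw[..., 2]) + 1e-6   # keep scale strictly positive
        return torch.distributions.StudentT(df, loc, scale)

head = StudentTHead()
dist = head(torch.randn(4, 32, 256))             # a distribution per time point
loss = -dist.log_prob(torch.randn(4, 32)).mean() # train with negative log-likelihood
```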


Another similar work is TimeGPT-1, which builds a GPT-style model for the time series domain. On the data side, TimeGPT is trained on a very large corpus of time series, totaling over 100 billion sample points and spanning many types of domains. During training, larger batch sizes and smaller learning rates are used to improve robustness. The backbone of the model is the classic GPT architecture.
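As a rough illustration of that recipe (TimeGPT's exact architecture is not described at this level of detail, so everything below is an assumption), this sketch trains a toy decoder-only Transformer to predict the next patch of a series, using the large-batch, small-learning-rate setup the paragraph above describes.

```python
# Toy decoder-only forecaster (illustrative only, not TimeGPT's architecture).

import torch
import torch.nn as nn

class TinyTimeGPT(nn.Module):
    def __init__(self, patch_len=32, d_model=256, n_layers=4, n_heads=8):
        super().__init__()
        self.embed = nn.Linear(patch_len, d_model)
        block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, n_layers)
        self.head = nn.Linear(d_model, patch_len)    # predict the next patch

    def forward(self, patches):                      # (B, n_patches, patch_len)
        h = self.embed(patches)
        mask = nn.Transformer.generate_square_subsequent_mask(patches.size(1))
        h = self.blocks(h, mask=mask)                # causal self-attention, GPT-style
        return self.head(h)

model = TinyTimeGPT()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)  # small learning rate for stability
x = torch.randn(512, 16, 32)                          # large batch of patch sequences
loss = nn.functional.mse_loss(model(x[:, :-1]), x[:, 1:])  # next-patch prediction
loss.backward()
opt.step()
```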


The experimental results reported in the paper show that on zero-shot tasks, this pre-trained time series large model achieves significant improvements over baseline models.


4. Summary

This article introduced the research directions for time series forecasting in the era of large models: using NLP large models directly for time series forecasting, and training large models natively on time series data. Whichever approach is taken, these works demonstrate the potential of large models for time series and mark a direction worth studying in depth.
