


A "pentagon warrior" of time series analysis! Tsinghua University proposes TimesNet: leading in forecasting, imputation, classification, and anomaly detection
Task generality is a core issue in the research of foundational deep learning models, and one of the main focuses of recent work on large models.
However, analysis tasks in the time series field differ greatly, ranging from forecasting, which requires fine-grained modeling, to classification, which requires extracting high-level semantic information. How to build a unified deep foundation model that efficiently handles all of these tasks had no established solution before.
To this end, a team from the School of Software at Tsinghua University studied the fundamental problem of temporal variation modeling and proposed TimesNet, a task-general foundation model for time series analysis. The paper has been accepted at ICLR 2023.
Authors: Haixu Wu*, Tengge Hu*, Yong Liu*, Hang Zhou, Jianmin Wang, Mingsheng Long
Link: https://openreview.net/pdf?id=ju_Uqw384Oq
Code: https://github.com/thuml/TimesNet
Time series algorithm library: https://github.com/thuml/Time-Series-Library
TimesNet achieves comprehensive state-of-the-art performance across five major tasks: long-term forecasting, short-term forecasting, imputation (missing value filling), anomaly detection, and classification.
Unlike sequence data such as natural language or video, a time series stores only a few scalars at each time point; its key information lies largely in the temporal variation. Modeling temporal variations is therefore the core problem shared by all types of time series analysis tasks.
In recent years, various deep models have been widely applied to time series analysis, such as recurrent neural networks (RNNs), temporal convolutional networks (TCNs), and Transformers. However, the first two mainly capture variations between nearby time points and lack the capacity to model long-term dependencies. Transformers have a natural advantage in modeling long-term dependencies, but because real-world temporal variations are extremely complex, it is difficult to mine reliable temporal dependencies relying solely on attention between discrete time points.
To this end, this article analyzes temporal variations from a new perspective of multi-periodicity, as shown in the figure below. We observe that:
- Time series naturally exhibit multi-periodicity.
Real-world time series are often superpositions of different periodic processes. For example, traffic data varies on a daily basis in the short term and on a weekly basis in the long term. These processes with different periods overlap and interfere with one another, which poses great challenges for time series analysis.
- Time series exhibit two kinds of temporal variation: intra-period and inter-period.
Specifically, within a process of a given period, the variation at each time point is related not only to the adjacent moments but also to the corresponding phase in neighboring periods. Intra-period variations correspond to short-term processes, while inter-period variations reflect long-term trends across consecutive periods. (Note: if a time series has no obvious periodicity, this is equivalent to the period being infinitely long.)
2 Design Ideas
Based on the above two observations, we designed TimesNet along the following lines:
- The multi-periodicity of time series naturally suggests a modular design: each module captures the temporal variations dominated by one specific period. This modular design decouples complex temporal variations, which benefits subsequent modeling.
- For the intra-period and inter-period variations, this article innovatively proposes expanding the one-dimensional time series into a two-dimensional space for analysis. As shown in the figure above, folding a one-dimensional series according to multiple periods yields multiple two-dimensional tensors (2D tensors); the columns and rows of each 2D tensor reflect the intra-period and inter-period variations respectively, i.e., the temporal 2D-variations.

Therefore, after folding the time series, we can directly use advanced visual backbone networks, such as Swin Transformer, ResNeXt, or ConvNeXt, to extract features from it. This design also allows time series analysis tasks to directly benefit from the booming computer vision field.

3 TimesNet
Based on the above ideas, we propose the TimesNet model, which disentangles complex temporal variations into different periods through a modular structure and converts the original one-dimensional time series into two-dimensional space, achieving unified modeling of intra-period and inter-period variations. In this section, we first introduce how a time series is expanded into two-dimensional space, and then describe the overall architecture of the model.
3.1 Temporal Variations: 1D -> 2D
The folding process is shown in the figure above and consists of two main steps:
(1) Period extraction. For a one-dimensional time series $\mathbf{X}_{1D} \in \mathbb{R}^{T \times C}$ with time length $T$ and $C$ channels, the period information can be extracted directly by applying the fast Fourier transform (FFT) along the time dimension:

$$\mathbf{A} = \mathrm{Avg}\big(\mathrm{Amp}(\mathrm{FFT}(\mathbf{X}_{1D}))\big), \quad \{f_1, \cdots, f_k\} = \arg\mathrm{Topk}(\mathbf{A}), \quad p_i = \left\lceil \frac{T}{f_i} \right\rceil, \; i \in \{1, \cdots, k\}$$

where $\mathbf{A}$ is the intensity (amplitude) of each frequency component, and the $k$ frequencies with the greatest intensity correspond to the $k$ most significant period lengths $p_1, \cdots, p_k$.
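As a concrete illustration, here is a minimal PyTorch sketch of this step (the function and variable names are our own; the official repository linked above contains the reference implementation):

```python
import torch

def extract_periods(x: torch.Tensor, k: int = 2):
    """Pick the k most significant periods of a batch of series.

    x: [B, T, C] tensor (batch, time length, channels).
    Returns the k period lengths and the per-sample amplitudes of the
    k selected frequencies, following A = Avg(Amp(FFT(X_1D))).
    """
    T = x.shape[1]
    xf = torch.fft.rfft(x, dim=1)              # frequency spectrum along time
    amp = xf.abs().mean(dim=0).mean(dim=-1)    # amplitude averaged over batch/channels
    amp[0] = 0.0                               # ignore the DC component (no period)
    _, top_freqs = torch.topk(amp, k)          # k strongest frequencies
    periods = T // top_freqs                   # period length p = ceil-ish T / f
    return periods, xf.abs().mean(-1)[:, top_freqs]   # shapes: [k], [B, k]
```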
(2) Sequence folding (1D -> 2D). For each selected period $p_i$, fold the original one-dimensional time series:

$$\mathbf{X}_{2D}^{i} = \mathrm{Reshape}_{p_i, f_i}\big(\mathrm{Padding}(\mathbf{X}_{1D})\big), \quad i \in \{1, \cdots, k\}$$

where $\mathrm{Padding}(\cdot)$ appends zeros to the end of the sequence so that its length becomes divisible by $p_i$.
Through the above operations, we obtain a set of two-dimensional tensors $\mathbf{X}_{2D}^{1}, \cdots, \mathbf{X}_{2D}^{k}$, where $\mathbf{X}_{2D}^{i}$ corresponds to the two-dimensional temporal variations with period $p_i$.
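Under the same assumptions, the folding for one selected period can be sketched as:

```python
import torch
import torch.nn.functional as F

def fold_to_2d(x: torch.Tensor, period: int) -> torch.Tensor:
    """Fold a 1D series [B, T, C] into a 2D tensor [B, C, rows, period].

    Zeros are appended (the Padding step) so the length divides the
    period; the row index then walks across periods (inter-period axis)
    and the column index walks within one period (intra-period axis).
    """
    B, T, C = x.shape
    if T % period != 0:
        pad = period - T % period
        x = F.pad(x, (0, 0, 0, pad))           # zero-pad the end of the time axis
    rows = x.shape[1] // period
    # [B, rows, period, C] -> [B, C, rows, period], ready for 2D convolution
    return x.reshape(B, rows, period, C).permute(0, 3, 1, 2).contiguous()
```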
3.2 Model Design
The overall architecture of TimesNet is shown in the figure: the model stacks multiple TimesBlocks connected by residual connections.
Specifically, as shown in the figure below, each TimesBlock contains the following sub-processes:
(1) Folding the time series (1D -> 2D): TimesBlock first extracts the periods from the input one-dimensional temporal features $\mathbf{X}_{1D}^{l-1}$ and folds them into two-dimensional temporal variations, exactly as described in the previous section:

$$\mathbf{X}_{2D}^{l,i} = \mathrm{Reshape}_{p_i, f_i}\big(\mathrm{Padding}(\mathbf{X}_{1D}^{l-1})\big), \quad i \in \{1, \cdots, k\}$$
(2) Extracting two-dimensional temporal-variation representations (2D Representation): as analyzed above, the folded two-dimensional temporal variations exhibit 2D locality, so 2D convolutions can be used directly for feature extraction. Here we choose the classic Inception model:

$$\hat{\mathbf{X}}_{2D}^{l,i} = \mathrm{Inception}\big(\mathbf{X}_{2D}^{l,i}\big), \quad i \in \{1, \cdots, k\}$$
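A simplified sketch of the Inception idea follows (our own illustrative code; the kernel sizes and structure are assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class InceptionBlock2D(nn.Module):
    """Parallel 2D convolutions with different kernel sizes, averaged.

    A simplified stand-in for the Inception block inside a TimesBlock:
    each branch sees the folded [B, C, rows, period] tensor, so the
    kernels jointly mix intra-period (column) and inter-period (row)
    variations.
    """
    def __init__(self, in_channels: int, out_channels: int,
                 kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_channels, out_channels, k, padding=k // 2)
            for k in kernel_sizes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.stack([branch(x) for branch in self.branches]).mean(0)
```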
It is worth noting that, since the 1D temporal features have been converted into 2D space, we can also employ many cutting-edge models from computer vision, such as ResNeXt, ConvNeXt, and the attention-based Swin Transformer. This lets time series analysis advance hand in hand with visual backbone networks.
(3) Unfolding the time series (2D -> 1D): for the subsequent multi-period fusion, we unfold the two-dimensional temporal-variation representations back into one-dimensional space:

$$\hat{\mathbf{X}}_{1D}^{l,i} = \mathrm{Trunc}\big(\mathrm{Reshape}_{1, (p_i \times f_i)}(\hat{\mathbf{X}}_{2D}^{l,i})\big), \quad i \in \{1, \cdots, k\}$$

where $\mathrm{Trunc}(\cdot)$ removes the zeros added by the $\mathrm{Padding}(\cdot)$ operation in step (1).
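Continuing the sketches above, the unfolding step can be written as:

```python
import torch

def unfold_to_1d(x2d: torch.Tensor, T: int) -> torch.Tensor:
    """Unfold [B, C, rows, period] back to [B, T, C].

    The slice [:, :T] plays the role of Trunc(.), removing the zeros
    appended by Padding(.) during folding.
    """
    B, C, rows, period = x2d.shape
    x = x2d.permute(0, 2, 3, 1).reshape(B, rows * period, C)
    return x[:, :T, :]
```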
(4) Adaptive fusion (1D Aggregation): to fuse the multi-period information, we take a weighted sum of the extracted one-dimensional representations, with weights given by the corresponding frequency intensities obtained in step (1):

$$\hat{\mathbf{A}}_{f_1}^{l-1}, \cdots, \hat{\mathbf{A}}_{f_k}^{l-1} = \mathrm{Softmax}\big(\mathbf{A}_{f_1}^{l-1}, \cdots, \mathbf{A}_{f_k}^{l-1}\big), \qquad \mathbf{X}_{1D}^{l} = \sum_{i=1}^{k} \hat{\mathbf{A}}_{f_i}^{l-1} \times \hat{\mathbf{X}}_{1D}^{l,i}$$
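In code, this fusion is a softmax-weighted sum over the per-period outputs (again a sketch in our own notation):

```python
import torch

def aggregate(outputs: list, amplitudes: torch.Tensor) -> torch.Tensor:
    """Fuse the k per-period outputs by their frequency intensities.

    outputs:    k tensors of shape [B, T, C] (already unfolded).
    amplitudes: [B, k] amplitudes of the selected frequencies from step (1).
    """
    stacked = torch.stack(outputs, dim=-1)            # [B, T, C, k]
    weights = torch.softmax(amplitudes, dim=1)        # [B, k]
    weights = weights.unsqueeze(1).unsqueeze(1)       # [B, 1, 1, k]
    return (stacked * weights).sum(dim=-1)            # weighted sum over periods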
Through this 1D-to-2D design, TimesNet realizes the temporal-variation modeling process of "extracting two-dimensional temporal variations over multiple periods, then fusing them adaptively."
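Putting the four sub-processes together, one TimesBlock can be sketched by composing the helpers defined above (a simplified illustration, not the official implementation; the residual connection reflects the stacked-block design mentioned earlier):

```python
import torch
import torch.nn as nn

class TimesBlockSketch(nn.Module):
    """One TimesBlock: period extraction -> 1D->2D folding -> 2D conv ->
    2D->1D unfolding -> amplitude-weighted aggregation."""
    def __init__(self, channels: int, k: int = 2):
        super().__init__()
        self.k = k
        self.conv = InceptionBlock2D(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: [B, T, C]
        T = x.shape[1]
        periods, amps = extract_periods(x, self.k)         # step (1)
        outs = []
        for i in range(self.k):
            x2d = fold_to_2d(x, int(periods[i]))           # step (1), folding
            x2d = self.conv(x2d)                           # step (2)
            outs.append(unfold_to_1d(x2d, T))              # step (3)
        return aggregate(outs, amps) + x                   # step (4) + residual
```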
4 Experiment
We conducted experiments on five major tasks: long-term forecasting, short-term forecasting, imputation, anomaly detection, and classification, covering 36 datasets and 81 different experimental settings. We compared 19 different deep methods spanning RNN-, CNN-, MLP-, and Transformer-based models, including N-BEATS (2019), Autoformer (2021), LSSL (2022), N-HiTS (2022), FEDformer (2022), and DLinear (2023).
4.1 Overall results
As shown in the opening radar chart, TimesNet achieves state-of-the-art results on all five tasks.
(1) Long-term forecasting: on this closely watched task, TimesNet surpasses the state-of-the-art Transformer- and MLP-based models.
(2) Short-term forecasting: the M4 dataset used in this experiment contains 6 sub-datasets with different sampling frequencies, totaling more than 100,000 time series. TimesNet still achieves the best results under this complex data distribution, verifying its capability for temporal-variation modeling.
(3) Classification: on this task, TimesNet surpasses the classic ROCKET algorithm and the cutting-edge deep model Flowformer.
For results on the other tasks, please see the paper.
4.2 Generalization of the visual backbone network
We replace the Inception network in TimesNet with different visual backbone networks, such as ResNet, ConvNeXt, and Swin Transformer.
As shown in the figure below, more advanced visual backbones bring better results. This means that under the TimesNet framework, time series analysis can directly benefit from advances in visual backbone networks.
4.3 Representation Analysis
To further explore the source of TimesNet's effectiveness, we examine the relationship between model performance and the CKA (centered kernel alignment) similarity between a model's bottom-layer and top-layer representations. A lower CKA similarity indicates a greater difference between bottom- and top-layer representations, i.e., a more hierarchical representation.
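For reference, the commonly used linear variant of CKA (Kornblith et al., 2019) can be computed as follows (a sketch; the paper does not specify this exact implementation):

```python
import torch

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> float:
    """Linear CKA similarity between two representation matrices.

    X, Y: [n_samples, n_features] activations of two layers on the same
    inputs. Returns a value in [0, 1]; higher means more similar.
    """
    X = X - X.mean(dim=0, keepdim=True)    # center the features
    Y = Y - Y.mean(dim=0, keepdim=True)
    hsic = (Y.T @ X).norm() ** 2           # ||Y^T X||_F^2
    denom = (X.T @ X).norm() * (Y.T @ Y).norm()
    return (hsic / denom).item()
```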
From the above visualization, we can observe:
- In forecasting and anomaly detection, better-performing models show higher similarity between bottom- and top-layer representations, indicating that these tasks rely on low-level representations;
- In classification and imputation, better-performing models show lower similarity between bottom- and top-layer representations, indicating that these tasks require hierarchical representations, i.e., stronger global feature extraction.
Thanks to its convolutions in 2D space, TimesNet can learn task-appropriate representations: low-level representations for forecasting and anomaly detection, and hierarchical abstract features for classification and imputation. This further demonstrates TimesNet's task generality as a foundation model.
This representation analysis also offers design guidance for task-specific deep models: forecasting calls for extracting low-level fine-grained features, whereas imputation additionally requires learning global representations.
5 Summary
Inspired by the multi-periodicity of time series, this article proposes TimesNet, a task-general foundation model for time series analysis. The model innovatively folds a one-dimensional time series into two-dimensional space and uses 2D convolutions to extract temporal features. This design allows time series analysis tasks to benefit directly from booming visual backbone networks, and is inspiring for follow-up research.
TimesNet achieves comprehensive state-of-the-art performance on the five mainstream time series analysis tasks (long-term forecasting, short-term forecasting, imputation, anomaly detection, and classification) and has excellent practical value.