


Application of deep-learning-based text emotion recognition technology in the 5G bad information security management and control platform
Author | Sun Yue, Unit: China Mobile (Hangzhou) Information Technology Co., Ltd. | China Mobile Hangzhou R&D Center
Labs Introduction
As 5G networks develop and gain popularity, a large number of users are beginning to encounter and use them. 5G networks can not only carry the voice, video, text, and other information of traditional networks, but can also serve more practical application scenarios thanks to lower latency and high-precision positioning, such as live battlefield information, satellite positioning, and navigation.
Internet information is often mixed with bad information, such as politics-related, pornographic, gang-related, fraudulent, and commercial advertising information. The volume of bad information is increasing year by year, causing serious harassment to users. To purify the network environment and effectively control the spread of bad information, China Mobile's 5G bad information security management and control platform came into being.
Data source: China Mobile Group Information Security Center
1. Application scenarios of the 5G bad information management and control platform
When faced with a complex network information environment, the platform classifies messages such as text messages, voice messages, video messages, and rich media messages into categories: politics-related, pornographic, gang-related, fraud-related, commercial advertising, normal messages, and so on. It then intercepts them promptly through the corresponding strategies and applies follow-up penalties according to the severity of the bad information, so as to purify the network environment at its root and create a healthy cyberspace.
2. Key technical points of the existing 5G bad information management and control platform
The platform mainly intercepts bad information through the following methods:
① Set first-level keywords: First-level keywords are usually extremely sensitive words. If a user sends a message containing a first-level keyword, the message is intercepted immediately, its content is not delivered, and the user is marked.
② Set common keywords: Common keywords are moderately sensitive words. If a user sends messages containing common keywords and, within a certain period of time, the number of such messages exceeds the system's preset interception threshold, the system adds the user to the blacklist, and for a certain period the user cannot use full 5G network services.
③ Set complex text information monitoring: If a user sends a file such as a PDF that contains both text and pictures, the text is extracted and filtered by the first-level and common keyword mechanisms, while the pictures are filtered by the rich media mechanism. Of the separate filtering results for text and pictures, the more severe one is taken as the processing result for the file.
3. Technical weaknesses of the existing 5G bad information management and control platform
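As a rough illustration of the keyword mechanisms the existing platform relies on, the sketch below models first-level keywords and common keywords with a send-count threshold. The word lists, the threshold, the window length, and the `KeywordFilter` class are all hypothetical placeholders, and delivering a common-keyword message that is still under the threshold is an assumption rather than something the platform description states.

```python
import time
from collections import defaultdict, deque

# All word lists, thresholds, and names below are hypothetical placeholders.
FIRST_LEVEL = {"forbidden-term"}
COMMON = {"sensitive-term"}
THRESHOLD = 3          # hits allowed inside the window before blacklisting
WINDOW_SECONDS = 3600  # length of the sliding window


class KeywordFilter:
    def __init__(self):
        self.hits = defaultdict(deque)  # user -> timestamps of common-keyword hits
        self.marked = set()             # users who sent a first-level keyword
        self.blacklist = set()          # users whose service is restricted

    def check(self, user, text, now=None):
        """Return 'intercept', 'blacklist', or 'deliver' for one message."""
        now = time.time() if now is None else now
        if any(word in text for word in FIRST_LEVEL):
            self.marked.add(user)
            return "intercept"
        if any(word in text for word in COMMON):
            window = self.hits[user]
            window.append(now)
            # Drop hits that have fallen out of the sliding window.
            while window and now - window[0] > WINDOW_SECONDS:
                window.popleft()
            if len(window) > THRESHOLD:
                self.blacklist.add(user)
                return "blacklist"
        # Assumption: anything else, including sub-threshold hits, is delivered.
        return "deliver"
```

A message that carries strongly negative sentiment but none of the listed words is simply delivered by a filter of this kind, which is the gap this section describes.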
The filtering mechanism of the existing 5G bad information security management and control platform can only filter specified, limited phrases and short sentences. As the Internet grows in popularity, large numbers of new words emerge every day, and manual additions alone can no longer keep the vocabulary library up to date quickly enough. Moreover, many users today send text messages that contain no illegal words at all, yet the thoughts and feelings they express carry strong negative emotional tendencies. Matching words and short sentences alone cannot intercept such negative emotional content. Using text sentiment analysis to submit sentences rich in negative emotional tendencies for review and interception can therefore further strengthen bad information control and reduce the harm that spam does to users.
By building a text emotion library that contains popular Internet phrases and news messages, the emotions expressed in the text are divided into three categories: positive, neutral, and negative. Each text is labeled with one of these three categories, and a deep learning network is trained on the texts in the emotion library. The trained model can then be used in the 5G bad information management and control platform to intercept messages with bad emotional content.
4. Technical implementation details of the deep-learning-based 5G bad information management and control system
This technology has three main components: the jieba word segmentation system, phrase vectorization, and the text emotion recognition algorithm. The components interact as follows:
Interaction flow chart of each module
Crawler technology is used to crawl Internet phrases and news messages as the original text. The original text is divided into a training set and a test set at a ratio of 8:2, the text in the training set is labeled, and the text is then segmented with the jieba word segmentation tool. For example, the sentence "He came to the Mobile Hangyan Building" is segmented by jieba into: he / came to / Mobile / Hangyan / Building. Finally, the segmented data is organized into a corpus. Because the training and test sets contain a very large amount of text (usually millions of entries), the segmented corpus is also very large (tens of millions of entries). Although these phrases can be stored in the corpus in numbered form, the sheer volume of data easily leads to the curse of dimensionality. Modal particles and similar words that appear in the text, such as "了" (le), "的", and "我", occur very frequently but contribute little to the emotion expressed, so we remove them from the corpus to reduce its dimensionality.
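The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the platform's code: the stop-word list, the sample sentences, and the helper names `preprocess`, `split_8_2`, and `build_vocab` are hypothetical. Whitespace splitting stands in for the segmenter so the sketch runs without extra dependencies; with jieba installed, `jieba.lcut` could be passed as `segment` instead.

```python
import random
from collections import Counter

# Hypothetical stop-word list; a real one would be far longer.
STOP_WORDS = {"了", "的", "我"}


def preprocess(sentences, segment, stop_words=STOP_WORDS):
    """Segment each sentence and drop high-frequency, low-value words."""
    return [[tok for tok in segment(s) if tok not in stop_words] for s in sentences]


def split_8_2(items, seed=0):
    """Shuffle a copy of the data and split it 8:2 into train and test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]


def build_vocab(tokenized):
    """Number each distinct token, most frequent first."""
    counts = Counter(tok for sent in tokenized for tok in sent)
    return {tok: idx for idx, (tok, _) in enumerate(counts.most_common())}


# Hypothetical sample sentences, already separated by spaces for illustration.
sentences = ["他 来到 了 移动 杭研 大厦", "我 的 大厦"]
tokens = preprocess(sentences, segment=str.split)
```

Removing stop words before numbering the vocabulary keeps very frequent but uninformative tokens from taking up dimensions in the later vector representation.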
We feed the vectorized phrases of the training set into the deep learning network for training to obtain the corresponding model, and then run the test set through the model to check the recognition results. Once the model reaches a satisfactory accuracy, it is connected to the 5G bad information management and control platform, where messages sent by users are filtered end to end. If bad information is found during filtering, it is intercepted promptly, which makes the system's interception of bad information more systematic and comprehensive.
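The train-then-evaluate loop can be illustrated with a deliberately small stand-in. The article does not specify the network architecture, so the sketch below swaps in a single-layer softmax classifier over bag-of-words counts, which is not a deep network; the toy data, the label order, and the `ACCURACY_GATE` deployment threshold are all assumptions made for illustration.

```python
import math

LABELS = ["positive", "neutral", "negative"]


def featurize(tokens, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def train(samples, vocab, epochs=200, lr=0.5):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    weights = [[0.0] * len(vocab) for _ in LABELS]
    data = [(featurize(toks, vocab), LABELS.index(label)) for toks, label in samples]
    for _ in range(epochs):
        grads = [[0.0] * len(vocab) for _ in LABELS]
        for x, y in data:
            probs = predict_proba(weights, x)
            for k in range(len(LABELS)):
                err = probs[k] - (1.0 if k == y else 0.0)
                for j, xj in enumerate(x):
                    grads[k][j] += err * xj / len(data)
        for k in range(len(LABELS)):
            for j in range(len(vocab)):
                weights[k][j] -= lr * grads[k][j]
    return weights


def accuracy(weights, samples, vocab):
    hits = 0
    for toks, label in samples:
        probs = predict_proba(weights, featurize(toks, vocab))
        hits += LABELS[probs.index(max(probs))] == label
    return hits / len(samples)


# Hypothetical, already segmented toy data (not from the article).
train_set = [
    (["great", "service"], "positive"), (["love", "it"], "positive"),
    (["terrible", "scam"], "negative"), (["hate", "this"], "negative"),
    (["meeting", "noon"], "neutral"), (["see", "file"], "neutral"),
]
test_set = [(["love", "service"], "positive"), (["terrible"], "negative")]
vocab = {tok: i for i, tok in enumerate(sorted({t for toks, _ in train_set for t in toks}))}
model = train(train_set, vocab)
ACCURACY_GATE = 0.9  # assumed deployment threshold
ready = accuracy(model, test_set, vocab) >= ACCURACY_GATE
```

Gating deployment on held-out accuracy mirrors the idea that the model is connected to the platform only once its test results are good enough; a production system would use a real network and a far larger labeled set.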
Specific steps are as follows:
- Crawl the original text corpus from the Internet and preprocess it: remove modal particles, delete punctuation marks and whitespace, and delete stop words, rare words, and specific words that appear in the text. Use the jieba library to segment the text, cutting sentences accurately into separate phrases;
- Divide the crawled text data set into a training set and a test set in a fixed proportion. Manually annotate the text sentences as positive, negative, or neutral emotion. Use the jieba library to segment the sentences in the training set and the test set respectively, and build the segmented training set into a corpus;
- Vectorize the phrases from step 1, so that each segmented word is mapped to a multi-dimensional, continuous-valued vector, yielding the word vector matrix of the entire data set;
- First extract the clause that contains the emotional word, which reduces sentence complexity; then predict the position of the emotional object within the clause based on various features, and extract the emotion from that position. Emotion extraction obtains the valuable emotional information in the text and determines the role a word or phrase plays in emotional expression, including tasks such as identifying the emotion holder, the evaluation object, and the opinion words;
- Feed the emotion vectors obtained above into the deep learning network to obtain a text emotion recognition model, then feed the emotion vectors of the test set into the model and check the test results. Data whose detection result is normal continues on to regular policy filtering, such as text matching and rich media recognition.
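Two of the steps above, mapping each segmented word to a continuous-valued vector and passing messages with a normal result on to regular policy filtering, can be sketched as follows. The vectors here are random rather than trained, and `toy_classify` and `toy_keyword_filter` are hypothetical stand-ins for the trained emotion model and the existing keyword policy.

```python
import random


def build_embeddings(vocab, dim=4, seed=0):
    """Assign every token a continuous-valued vector (random here, not trained)."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def sentence_matrix(tokens, table, dim=4):
    """Stack token vectors into a matrix; unknown tokens map to zeros."""
    return [table.get(tok, [0.0] * dim) for tok in tokens]


def route(tokens, classify, keyword_filter):
    """Negative messages go to review; the rest continue to regular filtering."""
    if classify(tokens) == "negative":
        return "review"
    return keyword_filter(tokens)


# Hypothetical stand-ins for the trained model and the existing keyword policy.
def toy_classify(tokens):
    return "negative" if "hopeless" in tokens else "neutral"


def toy_keyword_filter(tokens):
    return "intercept" if "forbidden-term" in tokens else "deliver"
```

Running the emotion check first and the keyword policy second means the two mechanisms complement each other: each can catch messages the other would let through.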
5. Advantages of 5G interception system incorporating deep learning
Compared with the existing 5G interception system, the 5G interception system integrated with deep learning has the following advantages:
- Deep learning technology provides effective identification with high reliability and authenticity;
- Emotion recognition with deep learning requires little manual intervention and is highly efficient;
- Text emotion recognition effectively compensates for the shortcomings of keyword interception;
- With text emotion recognition, the strategy can be updated automatically and supplemented with new entries in a timely manner, improving efficiency.
Closing remarks:
Deep learning is now applied in a very wide range of fields. Relying on repeated training and self-learning, it can greatly reduce manual workload while improving efficiency and accuracy. It is suited not only to the bad information interception system described above; I believe that in the near future the technology will also shine in other emerging fields. Of course, deep learning itself is not perfect and cannot solve every thorny problem. For that very reason, we should keep applying deep learning to new scenarios and new fields, in order to achieve new breakthroughs and create a better, smarter future life.