ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot developed by OpenAI in the United States. It converses by understanding and learning human language and responds to users based on the context of the conversation, so that chatting with it feels much like communicating with a person. It can also complete tasks such as writing emails, video scripts, marketing copy, code, and papers.
ChatGPT’s algorithm is based on the Transformer architecture, a deep neural network that uses a self-attention mechanism to process input data. The Transformer architecture is widely used in natural language processing tasks such as machine translation, text summarization, and question answering. ChatGPT uses the GPT-3.5 large language model (LLM) and fine-tunes this pre-trained model with reinforcement learning. The reinforcement learning used here is RLHF (Reinforcement Learning from Human Feedback), which relies on human-annotated data. Its purpose is to let the LLM learn, through a reward and penalty mechanism, to handle a variety of natural language processing tasks and to judge what makes an answer high quality along three dimensions: helpfulness, honesty, and harmlessness.
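To make the self-attention idea more concrete, here is a minimal pure-Python sketch of scaled dot-product attention. It is an illustration only: a real Transformer derives the queries, keys, and values from learned linear projections of the input and runs many attention heads in parallel, whereas the function name `self_attention` and the toy vectors here are assumptions made for this example.

```python
import math


def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence.

    queries, keys, values: lists of vectors, one vector per token.
    Returns one output vector per query token.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this token's query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Each output vector mixes information from every position in the sequence, which is what lets the model relate a token in one part of a request to tokens elsewhere in it.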
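The main training process of the ChatGPT model, as described by OpenAI, has three stages: supervised fine-tuning on human-written demonstrations, training a reward model on human rankings of model outputs, and optimizing the model against that reward model with reinforcement learning (PPO).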
In the field of security detection, more and more enterprises and organizations are beginning to use artificial intelligence to help detect potential threats in network traffic. The advantage of artificial intelligence is that it can process large amounts of data and quickly and accurately identify and classify abnormal traffic. By training neural network models, artificial intelligence can automatically detect and identify network attacks, vulnerability exploits, malware, and other malicious behavior, reducing manual intervention and false positives and improving detection accuracy and efficiency.
The core of current mainstream network attack detection is HTTP access detection (WAF) built on DPI (deep packet inspection) technology, together with intrusion prevention (IPS) for the operating system. These systems are deployed in front of the application: they scan and filter user requests before the requests reach the server, analyze and verify the network packets of each request to ensure that it is safe and valid, and intercept or isolate invalid or malicious requests. The commonly used attack detection methods are as follows:
Rule-based (signature) detection identifies threats in network traffic, such as viruses, malware, and intrusions, using specific rules or patterns (regular expressions) written in advance. However, because attack methods are diverse, experienced attackers can bypass detection by changing a few statements. Regular expressions evolved from keyword matching. Although they reduce the false positive rate to some extent, they are still based on string filtering and can only detect predefined attack behaviors; for more complex injections, this method also suffers from a high false negative rate.
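The bypass problem described above can be shown with a small sketch. The signature `UNION_SELECT` and the two sample request lines are hypothetical and are not taken from any real WAF rule set; they only illustrate how a small rewrite of the payload defeats string-based matching.

```python
import re

# Hypothetical signature: the keywords UNION and SELECT separated by whitespace.
UNION_SELECT = re.compile(r"union\s+select", re.IGNORECASE)


def rule_match(request_line: str) -> bool:
    """Return True if the request line matches the signature."""
    return UNION_SELECT.search(request_line) is not None


# A straightforward injection attempt: the rule fires.
plain = "GET /item?id=1 UNION SELECT user,password FROM users"

# The same attack with an SQL comment in place of the space: the rule misses it,
# because the pattern only anticipates whitespace between the two keywords.
evaded = "GET /item?id=1 UNION/**/SELECT user,password FROM users"
```

Here `rule_match(plain)` is true while `rule_match(evaded)` is false, even though many databases treat both payloads the same way. Each new evasion needs another rule, which is where the false negatives come from.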
Traffic analysis models and analyzes basic elements of similar traffic, such as source IP, the proportion of each protocol type, and upward and downward traffic trends, to draw conclusions about abnormal events. However, traffic analysis has to capture and analyze network traffic, so it demands substantial computing and storage resources, which makes the whole system relatively heavy.
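As a rough illustration of this kind of analysis, the sketch below compares the protocol mix of a current traffic window with a baseline and lists the busiest source IPs. The record layout (dictionaries with `proto` and `src_ip` keys), the function names, and the 0.2 threshold are all assumptions chosen for the example.

```python
from collections import Counter


def protocol_share(records):
    """Return the fraction of records seen for each protocol."""
    counts = Counter(r["proto"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {proto: count / total for proto, count in counts.items()}


def share_anomalies(baseline, current, threshold=0.2):
    """List protocols whose traffic share moved more than `threshold` from the baseline."""
    protos = set(baseline) | set(current)
    return sorted(
        p for p in protos
        if abs(current.get(p, 0.0) - baseline.get(p, 0.0)) > threshold
    )


def top_sources(records, n=3):
    """Return the n source IPs that sent the most records, with their counts."""
    return Counter(r["src_ip"] for r in records).most_common(n)
```

Even this toy version has to keep per-window counters for every protocol and source, which hints at why full traffic analysis needs significant compute and storage.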
Behavior analysis detects abnormal activity by monitoring the behavior of network traffic. For example, it may detect a web application server accessing a non-business database, sudden bursts of large data flows, or frequent access attempts, and from these discover potential network threats. In the process, some legitimate activity (such as a one-off download) will be falsely reported, and mature behavior analysis models take a long time to train, so protection efficiency may be low.
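One of the behaviors mentioned above, frequent access attempts, can be sketched with a simple sliding-window counter. The class name `BurstDetector` and its default limits are hypothetical, and a production behavior model would be learned from data rather than hard-coded like this.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flag a source that makes too many attempts within a time window.

    Timestamps are assumed to arrive in non-decreasing order.
    """

    def __init__(self, max_events=5, window_seconds=10.0):
        self.max_events = max_events
        self.window = window_seconds
        self.events = defaultdict(deque)  # source -> recent timestamps

    def observe(self, source, timestamp):
        """Record one attempt; return True if the source exceeded the limit."""
        recent = self.events[source]
        recent.append(timestamp)
        # Drop attempts that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) > self.max_events
```

A fixed threshold like this also shows where false reports come from: a legitimate user who retries quickly looks the same as an attacker to the counter.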
Semantic analysis designs the detection engine as a SQL semantic interpreter or command-line terminal that tries to understand the content entered by the user and determine whether it may constitute an attack. At present it is mainly aimed at SQL injection and has limited usage scenarios.
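A very reduced sketch of the semantic idea: instead of searching for suspicious strings, tokenize the SQL statement and check whether the user input changes the structure of the query. The tokenizer, the function names, and the query template below are assumptions for illustration and are far simpler than a real SQL interpreter.

```python
import re

# A tiny SQL tokenizer: string literals, numbers, words, a few operators.
TOKEN_RE = re.compile(
    r"(?P<string>'(?:[^']|'')*')"
    r"|(?P<number>\d+)"
    r"|(?P<word>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<op>--|[=<>!]+|[(),;*])"
    r"|(?P<space>\s+)"
    r"|(?P<other>.)"
)


def token_shape(sql):
    """Return the token sequence with literal values replaced by their type."""
    shape = []
    for match in TOKEN_RE.finditer(sql):
        kind = match.lastgroup
        if kind == "space":
            continue
        if kind in ("string", "number"):
            shape.append(kind)  # the literal's value does not matter, only its type
        else:
            shape.append(match.group().upper())
    return shape


def changes_query_structure(template, user_input):
    """True if inserting user_input yields a different token structure than a benign value."""
    benign = token_shape(template.format("x"))
    actual = token_shape(template.format(user_input))
    return benign != actual
```

With the template `"SELECT * FROM users WHERE name = '{}'"`, an ordinary name leaves the structure unchanged, while an input such as `' OR '1'='1` adds new tokens and is flagged. The approach depends on knowing the query grammar, which is why it is mostly limited to SQL injection.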
Beyond these limitations of DPI-based detection, there are also multiple ways to bypass the traffic parsing engine itself. For example, an attacker can exploit flaws in the DPI engine's HTTP protocol parsing: if the engine only recognizes port 80 as HTTP traffic and the web application listens on port 8080, its HTTP traffic will be parsed as non-HTTP, and application-layer attack detection is bypassed.
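The port-based parsing flaw can be illustrated by contrasting two classifiers. Both functions are hypothetical sketches and do not reproduce the logic of any particular DPI product; the content-based check is also deliberately simple.

```python
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ", b"OPTIONS ", b"PATCH ")


def is_http_by_port(dst_port: int, payload: bytes) -> bool:
    """Flawed heuristic: only traffic to port 80 is treated as HTTP."""
    return dst_port == 80


def is_http_by_content(dst_port: int, payload: bytes) -> bool:
    """Content-based check: look for an HTTP request line at the start of the payload."""
    first_line = payload.split(b"\r\n", 1)[0]
    return first_line.startswith(HTTP_METHODS) and b" HTTP/1." in first_line
```

For a request sent to port 8080, `is_http_by_port` returns false, so the payload never reaches the HTTP rules, while `is_http_by_content` still recognizes it as HTTP.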
We follow the DPI engine's unpacking process: the raw traffic is parsed into key fields, which are then matched against rules. A match means the packet contains attack behavior; no match means the packet is low risk. The DPI engine processes incoming traffic as follows:
The DPI engine groups traffic by session. Packets in the same group generally share the same five-tuple and make up the request and response messages of that session.
The DPI engine then disassembles the traffic layer by layer, following the protocol stack, until all fields are parsed.
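The grouping and layer-by-layer parsing can be sketched as follows. This is a simplified illustration that assumes each packet is raw bytes starting at an IPv4 header carrying TCP; the function names are hypothetical, and a real DPI engine also handles Ethernet framing, IPv6, fragmentation, and TCP reassembly.

```python
import socket
import struct
from collections import defaultdict


def parse_ipv4_tcp(packet: bytes):
    """Parse the IPv4 and TCP headers; return (five_tuple, payload) or None."""
    if len(packet) < 20:
        return None
    ihl = (packet[0] & 0x0F) * 4           # IPv4 header length in bytes
    proto = packet[9]                      # 6 means TCP
    if proto != 6 or len(packet) < ihl + 20:
        return None
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    tcp_len = (packet[ihl + 12] >> 4) * 4  # TCP header length in bytes
    payload = packet[ihl + tcp_len:]
    return (src_ip, src_port, dst_ip, dst_port, proto), payload


def session_key(five_tuple):
    """Direction-independent key so requests and responses share one session."""
    src_ip, src_port, dst_ip, dst_port, proto = five_tuple
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (a, b, proto)


def group_sessions(packets):
    """Group TCP payloads by session; non-TCP or truncated packets are skipped."""
    sessions = defaultdict(list)
    for raw in packets:
        parsed = parse_ipv4_tcp(raw)
        if parsed is None:
            continue
        five_tuple, payload = parsed
        sessions[session_key(five_tuple)].append(payload)
    return dict(sessions)
```

Sorting the two endpoints inside `session_key` is one way to put both directions of a connection into the same group; other engines track the initiator explicitly instead.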
Finally, the DPI engine extracts the plaintext application-layer request as the content to be inspected.
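A minimal sketch of that extraction step, assuming the payload holds one complete HTTP/1.x request. The function name `parse_http_request` and the returned dictionary layout are assumptions for this example; a malformed request line raises `ValueError` here, whereas a real engine would handle it more gracefully.

```python
def parse_http_request(raw: bytes):
    """Split a raw HTTP/1.x request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # ignore lines that are not "Name: value"
            headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }
```

Fields such as the request target, `cookie`, and `referer` can then be handed to the rule matcher, or, as in the rest of this article, the whole plaintext request can be passed to a language model.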
As a large-scale natural language processing model, ChatGPT can understand the original HTTP message, so it can detect an attack no matter whether it appears in the URL, Cookies, or Referer.
The attack judgment modules for ChatGPT, New Bing, and similar tools call the relevant OpenAI API and ask questions so that the model makes the attack judgment. The schematic code is as follows:
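    import openai

    # Note: this uses the legacy Completions interface of the openai Python SDK (pre-1.0).
    openai.api_key = "sk-Bew1dsFo3YXoY2***********81AkBHmY48ijxu"  # API token used for authentication

    def get_answer(prompt, max_tokens):
        """Send a prompt to the model and return (error, answer_text)."""
        try:
            response = openai.Completion.create(
                model="text-davinci-003",  # model name
                prompt=prompt,  # the question
                temperature=0.7,
                max_tokens=max_tokens,  # length limit of the returned content
                # False returns the whole answer at once. True streams it piece by piece,
                # like a typewriter, and returns an iterator that later code must consume.
                # Streaming is not handled here, so False is used.
                stream=False,
                top_p=1,
                frequency_penalty=0,
                presence_penalty=0,
            )
            return 0, response['choices'][0]['text'].strip()  # extract the answer text
        except Exception as e:  # error handling
            return str(e), None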
With the above function, you can achieve an effect similar to asking ChatGPT a question (the model used is text-davinci-003), as shown below:
ChatGPT returns a clear conclusion on whether the request contains attack behavior, along with a description of that behavior, which completes one attack judgment.
As shown in the figure above, the large number of requests in the traffic that need to be judged can be stored in separate files, and ChatGPT can then judge each one. The sample code is as follows:
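One way to turn such an answer into a machine-readable verdict is sketched below. The prompt wording, the `ask` parameter, and the parsing convention (a first line starting with "Yes" or "No") are assumptions made for this example; the project's own `judge_attack` may build its prompt and parse the reply differently.

```python
# Hypothetical prompt: ask for a Yes/No verdict first, then a description.
PROMPT_TEMPLATE = (
    "Does the following HTTP request contain an attack? "
    "Answer 'Yes' or 'No' on the first line, then describe the behavior.\n\n{request}"
)


def judge_attack(http_request, ask):
    """Return (err, sign, disc) for one HTTP request.

    `ask` is any callable taking (prompt, max_tokens) and returning (err, text),
    for example the get_answer function shown earlier.
    """
    err, answer = ask(PROMPT_TEMPLATE.format(request=http_request), 256)
    if err or not answer:
        return err or "empty answer", False, ""
    first_line, _, rest = answer.strip().partition("\n")
    sign = first_line.strip().lower().startswith("yes")
    disc = rest.strip() or first_line.strip()
    return 0, sign, disc
```

Passing the model call in as `ask` keeps the parsing logic testable without network access, for instance by substituting a function that returns a canned reply.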
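    # walk_dir, read_fileA, judge_attack and rf_rst are assumed to be defined
    # elsewhere in the full project (not shown in this article).
    sign_req = 0  # number of requests identified as attacks
    all_req = 0   # total number of requests examined

    def main(read_dir='detect'):
        args = []  # cache list
        global sign_req, all_req  # detection counters
        for rf in walk_dir(read_dir, ['.txt']):  # walk the directory of files to inspect
            all_req += 1  # increment the total packet count
            content = read_fileA(rf, 'str')[:2048]  # first 2048 characters of the packet file
            key_content = content.split('\r\n\r\n\r\n')[0][:1024]  # extract the HTTP request
            if len(key_content) < 10:
                continue  # too short to be worth checking
            err, sign, disc = judge_attack(key_content, rf_rst)  # call the ChatGPT API for attack detection
            if sign:
                sign_req += 1  # attack detected: increment the attack count
            # print the running result on a single line
            print('\r' + f' Checked {all_req: 4} packets, identified {sign_req} attacks, detection rate: {sign_req/all_req:0.2%}', end='', flush=True)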
In this way, batch packet attack detection can be achieved.
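The batch loop relies on two helpers, `walk_dir` and `read_fileA`, whose implementations are not shown in this article. The versions below are hypothetical stand-ins that match how the loop calls them; the real helpers in the project may behave differently.

```python
import os


def walk_dir(root, extensions):
    """Yield paths of files under `root` whose extension is in `extensions`."""
    wanted = {ext.lower() for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in wanted:
                yield os.path.join(dirpath, name)


def read_fileA(path, mode="str"):
    """Read a file; return text when mode is 'str', raw bytes otherwise."""
    with open(path, "rb") as fh:
        data = fh.read()
    if mode == "str":
        # Captured traffic may contain bytes that are not valid UTF-8.
        return data.decode("utf-8", errors="replace")
    return data
```

Decoding with `errors="replace"` is a design choice for this sketch: it keeps the loop running on binary payloads at the cost of replacing undecodable bytes.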
The attack samples come from Nuclei scans of target machines with full PoC detection. Some of these requests cannot be judged as threats from a single message alone.
Such cases may require more context to judge. This time we removed the request examples that cannot be judged accurately and kept, as far as possible, examples that a human can judge accurately. The overall detection results are as follows:
It can be seen that ChatGPT's traffic detection accuracy is very high, roughly equivalent to a security expert's quick judgment, and its security detection capabilities are worth looking forward to.
Interested readers can view the complete project source code at: https://github.com/VitoYane/PcapSplit
From the perspective of network security protection, enterprises and organizations can take targeted countermeasures: train models similar to ChatGPT, label malicious activities and malicious code, and set up guardrails that are difficult to bypass. For threats caused by ChatGPT, employees can be given new cybersecurity awareness training so that they acquire the knowledge needed to recognize social engineering attacks, including phishing attacks created with artificial intelligence tools such as ChatGPT.
Of course, this is not enough. Artificial intelligence tools such as ChatGPT will create new threats faster than human criminals can, and spread them faster than cybersecurity personnel can respond. The only way organizations can keep up with this rate of change is to respond to AI with AI.
In summary: on the one hand, researchers, practitioners, academic institutions, and enterprises in the cybersecurity industry can leverage the power of ChatGPT to innovate and collaborate in areas including vulnerability discovery, incident response, and phishing detection; on the other hand, as tools such as ChatGPT develop, building new network security tools will become more important. Security vendors should be more active in developing and deploying behavior-based (rather than rule-based) AI security tools to detect AI-generated attacks.