Use lexical analysis to extract domain names and IPs
Background
While analyzing logs, I found that some log parameters contained other URLs.
The lexical-analysis approach follows this article: https://blog.csdn.net/breaksoftware/article/details/7009209. Take a look if you are interested; following an expert's work really does teach you a lot.
The original is written in C. I wrote a similar version in Python for your reference.
Common URL classification
www.baidu.com
# Fragment of the tokenizer method: z is the input string, i the scan position.
# First octet: each ip_vN flag is set once at least one digit has been read.
while (i < len(z) and z[i].isdigit()):
    i = i + 1
    ip_v1 = True
reti = i
if i < len(z) and z[i] == '.':
    i = i + 1
    reti = i
else:
    tokenType = TK_OTHER
    reti = 1

# Second octet
while (i < len(z) and z[i].isdigit()):
    i = i + 1
    ip_v2 = True
if i < len(z) and z[i] == '.':
    i = i + 1
else:
    if tokenType != TK_DOMAIN:
        tokenType = TK_OTHER
        reti = 1

# Third octet
while (i < len(z) and z[i].isdigit()):
    i = i + 1
    ip_v3 = True
if i < len(z) and z[i] == '.':
    i = i + 1
else:
    if tokenType != TK_DOMAIN:
        tokenType = TK_OTHER
        reti = 1

# Fourth octet, followed by an optional ":port"
while (i < len(z) and z[i].isdigit()):
    i = i + 1
    ip_v4 = True
if i < len(z) and z[i] == ':':
    i = i + 1
    while (i < len(z) and z[i].isdigit()):
        i = i + 1
if ip_v1 and ip_v2 and ip_v3 and ip_v4:
    # All four octets were seen: record the IP (with port, if any)
    self.urls.append(z[0:i])
    return reti, tokenType
else:
    if tokenType != TK_DOMAIN:
        tokenType = TK_OTHER
        reti = 1
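The unrolled scan above can also be written as a loop. The following is only a minimal sketch under stated assumptions, not the author's code: the function name scan_ipv4, its return convention (end index or None), and the 0-255 range check on each octet are all introduced here for illustration.

```python
import string

def scan_ipv4(z, start=0):
    """Return the end index of an IPv4[:port] token beginning at `start`,
    or None if the text there does not have the dotted four-octet form.
    (Hypothetical helper; the range check is an added assumption.)"""
    i = start
    for octet in range(4):
        digits_start = i
        while i < len(z) and z[i] in string.digits:
            i += 1
        if i == digits_start:
            return None            # no digits where an octet was expected
        if int(z[digits_start:i]) > 255:
            return None            # not a valid octet value
        if octet < 3:
            # The first three octets must be followed by a dot
            if i < len(z) and z[i] == '.':
                i += 1
            else:
                return None
    # Optional ":port": consume it only if digits follow the colon
    if i < len(z) and z[i] == ':':
        j = i + 1
        while j < len(z) and z[j] in string.digits:
            j += 1
        if j > i + 1:
            i = j
    return i
```

With this sketch, z[:scan_ipv4(z)] yields the matched text when the result is not None. Returning None instead of raising lets the caller fall back to domain handling.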
Consider an input whose first part is 1234: scanning it matches the IP form at first, but the code then raises an exception. The IP-handling segment therefore needs an added check that determines whether the suffix is a top-level domain name:
https://github.com/skskevin/UrlDetect/blob/master/tool/domainExtract/domainExtract.py
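The suffix check described above can be illustrated with a small sketch. It is not taken from the linked script: the function name classify_host and the tiny KNOWN_TLDS set are hypothetical, and a real tool would load a complete top-level domain list.

```python
# Hypothetical, deliberately small TLD set used only for illustration.
KNOWN_TLDS = {"com", "net", "org", "cn", "io"}

def classify_host(token):
    """Classify a host token as 'domain', 'ip', or 'other'."""
    host = token.split(":", 1)[0]          # drop an optional :port
    labels = host.split(".")
    if len(labels) < 2 or any(not label for label in labels):
        return "other"                     # no dot, or an empty label
    # Check the suffix first, so a digit-leading name such as 1234.com
    # is treated as a domain and never reaches the IP branch.
    if labels[-1].lower() in KNOWN_TLDS:
        return "domain"
    if len(labels) == 4 and all(
        label.isascii() and label.isdigit() and int(label) <= 255
        for label in labels
    ):
        return "ip"
    return "other"
```

Checking the suffix before the numeric form is the design choice here: a token that starts with digits is only accepted as an IP once it is known not to end in a top-level domain.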