How can the front-end prevent collection as much as possible? What are some good implementation solutions?
Go and learn how crawlers scrape websites and how they evade anti-crawler measures, then come up with countermeasures one by one, haha
I usually check the Referer header, but it's of no use...
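For reference, a minimal sketch of that Referer check as Express middleware, assuming a Node back end; `https://example.com/` stands in for your real origin. As noted, the header is trivially forged, so this only filters the laziest scrapers:

```typescript
import express from "express";

const app = express();

// Reject API requests whose Referer is missing or points at another
// site. Easily spoofed: any serious scraper just sets the header.
app.use("/api", (req, res, next) => {
  const referer = req.get("referer") ?? "";
  if (!referer.startsWith("https://example.com/")) {
    res.status(403).send("Forbidden");
    return;
  }
  next();
});

app.listen(3000);
```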
Has no one done any research?
Prevent crawlers from scraping? There doesn't seem to be a perfect solution.
There is no perfect method. There are some auxiliary measures, such as blocking an IP by request count, e.g. more than 100 visits in a short period. But proxies exist, so it's of little use and only stops newbies.
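A minimal in-memory sketch of that idea; the numbers (100 requests per 60 seconds) are illustrative, and real deployments would keep the counters in shared state like Redis:

```typescript
// Sliding-window counter: block an IP that exceeds MAX_HITS requests
// within WINDOW_MS. Purely in-memory, so it resets on restart.
const WINDOW_MS = 60_000;
const MAX_HITS = 100;
const hits = new Map<string, number[]>();

function isBlocked(ip: string): boolean {
  const now = Date.now();
  // Keep only timestamps inside the window, then record this hit.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(ip, recent);
  return recent.length > MAX_HITS;
}
```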
There may also be concurrency restrictions, e.g. one client may only hold 10 concurrent connections.
In fact it's the same story: an IP proxy pool plus multi-threading still breaks through the concurrency limit, so again it only stops newbies.
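For completeness, a per-IP concurrency cap might look like this sketch (the cap of 10 is arbitrary); as the reply says, a proxy pool spreads requests across IPs and walks right past it:

```typescript
// Per-IP concurrency cap: at most MAX_CONCURRENT requests in flight
// per IP. Call tryAcquire() when a request arrives and release() when
// it finishes.
const MAX_CONCURRENT = 10;
const inFlight = new Map<string, number>();

function tryAcquire(ip: string): boolean {
  const n = inFlight.get(ip) ?? 0;
  if (n >= MAX_CONCURRENT) return false; // over the cap: reject (e.g. HTTP 429)
  inFlight.set(ip, n + 1);
  return true;
}

function release(ip: string): void {
  inFlight.set(ip, Math.max(0, (inFlight.get(ip) ?? 0) - 1));
}
```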
Fetch the data with Ajax and render it client-side; most run-of-the-mill scrapers don't execute JS.
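A sketch of that idea: ship an empty shell page and fetch the content after load. The endpoint `/api/article/42` and the `#content` element are made up for illustration, and headless browsers still defeat this:

```typescript
// The initial HTML contains an empty #content shell; the browser
// fills it in after load. Scrapers that only parse the raw HTML
// (no JS engine) see nothing.
async function renderArticle(): Promise<void> {
  const res = await fetch("/api/article/42");
  const data: { title: string; body: string } = await res.json();
  const el = document.querySelector("#content");
  if (el) el.textContent = `${data.title}\n\n${data.body}`;
}

window.addEventListener("DOMContentLoaded", () => {
  void renderArticle();
});
```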
Insert junk characters into the text, then use div/span tags and CSS to keep them from being displayed (e.g. invisible, minimum font size, same color as the background). The official website of "Reader" magazine used this method before.
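A rough sketch of that trick: interleave invisible junk spans with the real text so raw-HTML scrapers pick up garbage while browsers render clean copy. The 20% injection rate and the `display:none` styling are arbitrary choices:

```typescript
// Interleave invisible junk into the real text. A scraper that reads
// the raw HTML (or naively strips tags) gets garbage mixed into the
// copy; browsers never render the display:none spans.
function withJunk(text: string): string {
  const junkSpan = () =>
    `<span style="display:none">${Math.random().toString(36).slice(2, 5)}</span>`;
  return [...text]
    .map((ch) => (Math.random() < 0.2 ? ch + junkSpan() : ch))
    .join("");
}

// Usage in client- or server-side templating:
// element.innerHTML = withJunk("The real article text...");
```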
Whoever can achieve real anti-scraping purely on the front end, haha, deserves the Nobel Prize in Physics -- signed, PhantomJS
Add hidden controls, including a hidden URL; whoever accesses that URL is a machine.
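A sketch of such a honeypot, assuming an Express back end; the path `/trap` is made up. The page links it via an anchor humans never see, so any client requesting it is a crawler blindly following links:

```typescript
import express from "express";

const app = express();
const flaggedIps = new Set<string>();

// The page contains <a href="/trap" style="display:none">...</a>.
// No human ever clicks it; flag any IP that requests it as a bot,
// then rate-limit or block those IPs elsewhere.
app.get("/trap", (req, res) => {
  flaggedIps.add(req.ip ?? "unknown");
  res.status(404).send("Not Found");
});

app.listen(3000);
```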