You can modify robots.txt to tell search engine crawlers not to index the site. Blocking individual crawlers is much harder; you can only raise the cost, for example with more complex CAPTCHAs, access-frequency limits, and regular changes to page style and data formats.
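As a minimal sketch (assuming a Flask app; the disallow rules are illustrative, and robots.txt is only honored by well-behaved bots):

```python
# Sketch: serve a robots.txt asking all compliant crawlers not to index the site.
from flask import Flask, Response

app = Flask(__name__)

ROBOTS_TXT = """User-agent: *
Disallow: /
"""

@app.route("/robots.txt")
def robots():
    return Response(ROBOTS_TXT, mimetype="text/plain")
```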
If you are defending against targeted crawlers, add access restrictions such as per-IP rate limits and CAPTCHAs.
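A rough sketch of a per-IP frequency limit, assuming Flask and an in-memory counter (a real deployment would use Redis or similar, and might redirect to a CAPTCHA page instead of returning 429; the window and limit values are illustrative):

```python
# Sketch: per-IP access-frequency limit using an in-memory sliding window.
import time
from collections import defaultdict, deque
from flask import Flask, request, abort

app = Flask(__name__)

WINDOW_SECONDS = 60
MAX_REQUESTS = 30
hits = defaultdict(deque)  # ip -> timestamps of recent requests

@app.before_request
def rate_limit():
    now = time.time()
    q = hits[request.remote_addr]
    # Drop timestamps that have fallen outside the window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    q.append(now)
    if len(q) > MAX_REQUESTS:
        # Too many requests: block here, or send the client to a CAPTCHA challenge.
        abort(429)
```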
Render important content dynamically with JavaScript instead of including it in the initial HTML.
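One common pattern, sketched here with Flask (the route names and inline script are made up): serve an empty HTML shell and let client-side JS fetch the real data from a JSON endpoint, so a plain HTML scraper sees nothing useful:

```python
# Sketch: the HTML itself contains no data; client-side JS fetches it after load.
from flask import Flask, jsonify

app = Flask(__name__)

SHELL = """<html><body>
<div id="content">Loading...</div>
<script>
  fetch('/api/articles')
    .then(r => r.json())
    .then(items => {
      document.getElementById('content').textContent =
        items.map(i => i.title).join('\\n');
    });
</script>
</body></html>"""

@app.route("/")
def index():
    return SHELL  # empty shell, no article data in the markup

@app.route("/api/articles")
def articles():
    return jsonify([{"title": "Example article"}])
```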
Restrict requests by the HTTP Referer header (http_referer).
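A sketch of a Referer check in Flask (ALLOWED_HOSTS and the /data/ prefix are assumptions; the header is trivially forged, so this only filters naive crawlers):

```python
# Sketch: reject data requests whose Referer is missing or from an unknown host.
from urllib.parse import urlparse
from flask import Flask, request, abort

app = Flask(__name__)
ALLOWED_HOSTS = {"example.com", "www.example.com"}

@app.before_request
def check_referer():
    ref = request.headers.get("Referer", "")
    host = urlparse(ref).netloc
    if request.path.startswith("/data/") and host not in ALLOWED_HOSTS:
        abort(403)
```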
Use different templates for different pages or endpoints, the kind of variation that a single set of regular expressions cannot match reliably.
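A rough sketch of the idea: randomize wrapper tags and class names on every render so a fixed regex or CSS selector keeps breaking (the tag and class-name pools are illustrative):

```python
# Sketch: render the same data with randomized markup so fixed selectors break.
import random
import secrets

TAGS = ["div", "span", "section", "p"]

def render_item(title: str) -> str:
    tag = random.choice(TAGS)
    cls = "c" + secrets.token_hex(3)  # e.g. "c4fa1b" — different on every render
    return f'<{tag} class="{cls}">{title}</{tag}>'

print(render_item("Example article"))
```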
Embed random copyright or watermark strings in content that is likely to be scraped, so republished copies can be traced back.
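A sketch of embedding a per-response marker into the text (the marker format is made up; zero-width characters make it less obvious to a human reader while still identifying your copy if it shows up elsewhere):

```python
# Sketch: append a hard-to-notice, per-request copyright marker to content.
import secrets

def watermark(text: str, site: str = "example.com") -> str:
    token = secrets.token_hex(4)
    # Zero-width spaces hide the marker visually but keep it in the scraped text.
    marker = f"\u200b[{site}#{token}]\u200b"
    return text + marker

print(watermark("Article body goes here."))
```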
Require login before content can be viewed.
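A minimal login-required sketch using Flask sessions (the secret key, route names, and login view are placeholders; the real credential check is omitted):

```python
# Sketch: require a logged-in session before serving content.
from functools import wraps
from flask import Flask, session, redirect, url_for

app = Flask(__name__)
app.secret_key = "change-me"  # placeholder

def login_required(view):
    @wraps(view)
    def wrapped(*args, **kwargs):
        if not session.get("user_id"):
            return redirect(url_for("login"))
        return view(*args, **kwargs)
    return wrapped

@app.route("/login")
def login():
    return "login page"  # real credential check omitted

@app.route("/articles")
@login_required
def articles():
    return "members-only content"
```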
Keep access logs so abnormal traffic patterns can be spotted.
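A sketch of per-request access logging in Flask, recording IP, method, path, status, and User-Agent, which is usually enough to notice one IP hammering thousands of pages or suspicious UA strings (log file name and format are illustrative):

```python
# Sketch: log IP, method, path, status, and User-Agent for every request.
import logging
from flask import Flask, request

app = Flask(__name__)
logging.basicConfig(filename="access.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

@app.after_request
def log_request(response):
    logging.info("%s %s %s %s %r",
                 request.remote_addr, request.method, request.path,
                 response.status_code, request.headers.get("User-Agent", ""))
    return response
```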
That’s all I can think of. If someone is determined to scrape the site, these measures only make it a bit harder; they won’t stop it.