python - How can I stop my own website from being scraped by crawlers?
大家讲道理 2017-04-17 17:33:35

How can I stop my own website from being scraped by crawlers? What methods are there?


Replies (13)
黄舟

If you are defending against targeted crawlers, you can add some access restrictions, such as limiting request frequency or adding CAPTCHAs.
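A minimal sketch of the frequency-limiting idea, using only the standard library: allow each client IP at most `max_requests` requests per sliding window, and serve a CAPTCHA or an HTTP 429 once the limit is exceeded. The class and method names here are illustrative, not a real framework API.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window, per-IP request limiter (illustrative sketch)."""

    def __init__(self, max_requests=10, window=60.0):
        self.max_requests = max_requests
        self.window = window                 # window length in seconds
        self.hits = defaultdict(deque)       # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        # Discard timestamps that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False                     # over the limit: show CAPTCHA / return 429
        q.append(now)
        return True

limiter = RateLimiter(max_requests=3, window=60.0)
print(limiter.allow("1.2.3.4", now=0.0))     # first request passes
```

In a real deployment this state would live in something shared like Redis rather than process memory, since crawlers also rotate IPs to dodge exactly this kind of counter.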

阿神
  1. Render important content dynamically with JS

  2. Limit by http_referer

  3. Use different templates for different interfaces, the kind that a single set of regular expressions cannot match perfectly

  4. Add some random copyright information to content that may be scraped

  5. Require login before content can be viewed

  6. Record access logs

That’s all I can think of, but if someone really wants to scrape you, these measures will only make it a little harder.
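Point 2 above (limiting by `http_referer`) can be sketched as a small header check: reject requests whose `Referer` is missing or points at another domain. The allowed domain here is a made-up example, and note the Referer header is trivially forged, so this only raises the bar as the answer says.

```python
from urllib.parse import urlparse

# Hypothetical allow-list for your own site; replace with your real domains.
ALLOWED_HOSTS = {"example.com", "www.example.com"}

def referer_ok(headers):
    """Return True only if the Referer header names an allowed host."""
    referer = headers.get("Referer", "")
    if not referer:
        return False                      # no Referer at all: likely a bot
    host = urlparse(referer).hostname or ""
    return host in ALLOWED_HOSTS

print(referer_ok({"Referer": "https://example.com/list?page=2"}))
```

In practice you would apply this only to the data endpoints worth protecting (point 1's JS-loaded APIs), since some privacy-conscious browsers strip the Referer for legitimate users too.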

小葫芦

You can edit robots.txt to disallow search-engine crawling.
Blocking individual crawlers completely is difficult; you can only raise the difficulty, e.g. with more complex CAPTCHAs, access-frequency limits, and regularly changing page styles/data formats.
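Serving `User-agent: *` / `Disallow: /` in robots.txt asks all crawlers to stay away; compliant bots (search engines) honor it, hostile ones simply ignore it. The snippet below shows how a well-behaved crawler would evaluate those two lines, using the standard library's `urllib.robotparser`; the bot name and URL are made up for illustration.

```python
from urllib import robotparser

# Parse a robots.txt that disallows everything for every user agent.
rules = robotparser.RobotFileParser()
rules.parse([
    "User-agent: *",   # applies to all crawlers
    "Disallow: /",     # forbid the entire site
])

# A compliant crawler checks before fetching:
print(rules.can_fetch("SomeBot", "https://example.com/page"))  # → False
```

This is purely advisory, which is why the answer pairs it with CAPTCHAs and frequency limits for crawlers that do not play by the rules.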
