Python counts client IP visits based on nginx access logs

WBOY
Release: 2016-08-08 09:22:21

Professional analytics services such as Baidu Statistics, Google Analytics, and cnzz give webmasters the common metrics: UV, PV, time on site, IP count, and so on. Because of network issues, though, I found that the IP counts reported by Google Analytics and Baidu Statistics differ by several hundred, so I wanted to write my own script to find out the real number of visits. Counts taken from the nginx access log will be much higher than what those backends report, because the log also records spider visits and requests for static files. With a better algorithm, that noise could be filtered out. Today I will share only the most basic statistics, partly as a way to learn and review Python.
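As a rough sketch of the filtering idea, the snippet below skips requests whose user agent looks like a crawler or whose path ends in a static-file extension. The keyword list, the extension list, the regular expression, and the helper name is_countable are all my own assumptions for illustration; they are not part of the script in this article and would need tuning for a real site.

```python
import re

# Assumed, illustrative lists; adjust them for your own site.
SPIDER_KEYWORDS = ("spider", "bot", "crawler")
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico")

# Pulls the request path and the user agent out of a combined-style log line.
REQUEST_RE = re.compile(
    r'"(?:GET|POST|HEAD) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def is_countable(line):
    """Return True if the line looks like a page view from a human visitor."""
    match = REQUEST_RE.search(line)
    if not match:
        return False
    path, user_agent = match.group(1), match.group(2).lower()
    # Drop the query string before checking the file extension.
    path = path.split("?", 1)[0].lower()
    if path.endswith(STATIC_EXTENSIONS):
        return False
    return not any(keyword in user_agent for keyword in SPIDER_KEYWORDS)
```

A counting loop could call is_countable(line) first and only count the lines for which it returns True.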

For example, the nginx log on the server is as follows:

221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "0.020"
221.221.155.53 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "0.020"
221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "0.020"
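To make the layout of these lines concrete, here is a small sketch that splits one line into named fields. The field names and the helper parse_line are my own choices, and the pattern assumes the usual nginx "combined" ordering followed by extra quoted values, which it ignores because the log_format directive is not shown here. The user agent in the sample is shortened for readability.

```python
import re

# Assumed layout: combined-style fields; trailing quoted values are ignored.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


sample = ('221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" '
          '200 8482 "http://www.zuidaima.com/" "Mozilla/5.0" "-" "0.020"')
fields = parse_line(sample)
print(fields["ip"], fields["status"], fields["request"])
```

For counting visits per IP, only the first field matters, which is why a much simpler pattern anchored at the start of the line is enough.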

The statistical script is as follows:

stat_ip.py

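#encoding=utf8
import re

zuidaima_nginx_log_path = "/usr/local/nginx/logs/www.zuidaima.com.access.log"
# An IPv4 address at the very start of a log line
pattern = re.compile(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def stat_ip_views(log_path):
    ret = {}
    with open(log_path, "r") as f:
        for line in f:
            match = pattern.match(line)
            if match:
                ip = match.group(0)
                if ip in ret:
                    views = ret[ip]
                else:
                    views = 0
                views = views + 1
                ret[ip] = views
    return ret

def run():
    ip_views = stat_ip_views(zuidaima_nginx_log_path)
    max_ip_view = {}
    for ip in sorted(ip_views):  # sorted so the output order is stable
        views = ip_views[ip]
        if len(max_ip_view) == 0:
            max_ip_view[ip] = views
        else:
            _ip = list(max_ip_view.keys())[0]
            _views = max_ip_view[_ip]
            if views > _views:
                max_ip_view[ip] = views
                max_ip_view.pop(_ip)
        print("ip: %s, views: %d" % (ip, views))
    # Total number of distinct IPs
    print("total:", len(ip_views))
    # The IP with the most visits
    print("max_ip_view:", max_ip_view)

run()
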
The running results are as follows:


ip: 221.221.155.53, views: 1
ip: 221.221.155.54, views: 2
total: 2
max_ip_view: {'221.221.155.54': 2}

The output lists the number of visits for every IP, the total number of distinct IPs, and the IP with the most visits.
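The same totals can be produced more compactly with collections.Counter from the standard library. This is an alternative sketch and not the script in this article: the function name count_ips is hypothetical, it takes any iterable of lines rather than a file path, and the sample lines are truncated copies of the log lines shown earlier.

```python
import re
from collections import Counter

IP_RE = re.compile(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def count_ips(lines):
    """Count requests per client IP from an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = IP_RE.match(line)
        if match:
            counts[match.group(0)] += 1
    return counts


lines = [
    '221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482',
    '221.221.155.53 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482',
    '221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 200 8482',
]
counts = count_ips(lines)
print("total:", len(counts))          # number of distinct IPs
print("top:", counts.most_common(1))  # [(ip, views)] for the busiest IP
```

Because an open file is itself an iterable of lines, count_ips can be passed a file object directly, and most_common(n) gives the top n IPs without a hand-written maximum search.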


source:php.cn