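Downloading and Merging SAE Logs with a Python Script
For various reasons I needed the log files of a site hosted on SAE. SAE only lets you download logs one day at a time, and processing the downloads by hand is painful, especially when there are many of them. Fortunately, SAE provides an API that returns the download URLs of the log files in bulk, so I just wrote a Python script that downloads and merges these files automatically.
Calling the API to get the download URLs
The documentation is at http://sae.sina.com.cn/?m=devcenter&catId=281
Setting your application and download parameters
The request needs the following variables: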
api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'xxxxx'
from_date = '20140101'
to_date = '20140116'
url_type = 'http' # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http access|debug|error|warning|notice|resources
secret_key = 'xxxxx'
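Generating the request URL
The official documentation specifies how the request URL is built:
1. Sort the parameters.
2. Build the request string and remove the & characters.
3. Append the access_key (held in the secret_key variable in the code).
4. Compute the MD5 of that string to form sign.
5. Add sign to the request string.
The implementation is as follows: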
import collections
import hashlib
import json
import urllib2

params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type
if url_type == 'http':
    params['type2'] = url_type2
# sort the parameters by name
params = collections.OrderedDict(sorted(params.items()))
request = ''
for k, v in params.iteritems():
    request += k + '=' + v + '&'
# sign = md5(request string without '&' + secret_key)
sign = request.replace('&', '')
sign += secret_key
md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()
request = api_url + request + 'sign=' + sign
# request api
response = urllib2.urlopen(request).read()
response = json.loads(response)
if response['errno'] != 0:
    print '[!] ' + response['errmsg']
    exit()
print '[#] request success'
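The script above targets Python 2 (urllib2, print statements). As a minimal sketch of the same five signing steps in Python 3, the function below wraps them for reuse. The name build_signed_url is a hypothetical helper introduced here, not part of the SAE API, and the appname and key values are placeholders.

```python
import hashlib


def build_signed_url(api_url, params, secret_key):
    """Return api_url followed by the sorted query string and its MD5 sign."""
    # Step 1: sort the parameters by name.
    pairs = ['%s=%s' % (k, v) for k, v in sorted(params.items())]
    # Steps 2-3: concatenate the pairs without '&' and append the key.
    to_sign = ''.join(pairs) + secret_key
    # Step 4: the MD5 hex digest of that string is the sign.
    sign = hashlib.md5(to_sign.encode('utf-8')).hexdigest()
    # Step 5: append sign to the ordinary '&'-joined query string.
    return api_url + '&'.join(pairs + ['sign=' + sign])


url = build_signed_url(
    'http://dloadcenter.sae.sina.com.cn/interapi.php?',
    {'act': 'log', 'appname': 'xxxxx', 'from': '20140101',
     'to': '20140116', 'type': 'http', 'type2': 'access'},
    'xxxxx',  # placeholder secret key
)
print(url)
```

The resulting URL has the same shape as the one the Python 2 code builds: the sorted parameters joined with &, followed by sign=<32 hex characters>.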
Downloading the log files
SAE packs each day's logs into a tar.gz archive. Just download and save each one, naming the file <date>.tar.gz.
import re
import urllib2

log_files = list()
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')
for down_url in response['data']:
    # name each archive after the date found in its URL
    file_name = date_pattern.findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, 'wb') as f:
        f.write(data)
print '[#] you got %d log files' % len(log_files)
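The loop above reads each archive fully into memory and assumes every URL contains a date. A rough Python 3 sketch of a more defensive variant follows. The helpers archive_name and download are hypothetical names introduced here, and the example URL is made up purely to show the name extraction; no request is sent.

```python
import re
import shutil
import urllib.request

DATE_RE = re.compile(r'\d{4}-\d{2}-\d{2}')


def archive_name(down_url):
    """Return '<date>.tar.gz' for a URL containing YYYY-MM-DD, else None."""
    match = DATE_RE.search(down_url)
    return match.group(0) + '.tar.gz' if match else None


def download(down_url, file_name):
    """Stream the response to disk instead of reading it all into memory."""
    with urllib.request.urlopen(down_url) as resp, open(file_name, 'wb') as out:
        shutil.copyfileobj(resp, out)


# Hypothetical URL, used only to show the name extraction (no request is made).
print(archive_name('http://example.com/logs/2014-01-01-access.tar.gz'))
```

Returning None for a URL without a date lets the caller skip or report that entry instead of failing with an IndexError.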
Merging the files
To merge, use the tarfile library to extract each archive, then append the extracted file's contents to access_log.
import os
import tarfile

# extract each archive and append its log to access_log
with open('access_log', 'w') as access_log:
    for log_file in log_files:
        tar = tarfile.open(log_file)
        log_name = tar.getnames()[0]
        tar.extract(log_name)
        tar.close()
        # save to access_log
        with open(log_name) as f:
            access_log.write(f.read())
        os.remove(log_name)
print '[#] all files have been written to access_log'
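The merge can also be done without writing the extracted file to disk first. The Python 3 sketch below reads each member straight from the archive with tarfile's extractfile. It is an alternative to the approach above, not the author's method: merge_archives is a hypothetical helper, it takes the first regular file rather than the first name in the archive, and the demo archives are tiny made-up files built in a temporary directory.

```python
import io
import os
import shutil
import tarfile
import tempfile


def merge_archives(archive_paths, out_path):
    """Append the first regular file of each archive to out_path."""
    with open(out_path, 'wb') as out:
        for path in archive_paths:
            with tarfile.open(path) as tar:
                # Skip directories and other non-file members.
                member = next((m for m in tar.getmembers() if m.isfile()), None)
                if member is None:
                    continue
                shutil.copyfileobj(tar.extractfile(member), out)


# Self-contained demo with two tiny archives built in a temporary directory.
tmp = tempfile.mkdtemp()
paths = []
for day, line in [('2014-01-01', b'first day\n'), ('2014-01-02', b'second day\n')]:
    path = os.path.join(tmp, day + '.tar.gz')
    with tarfile.open(path, 'w:gz') as tar:
        info = tarfile.TarInfo(name=day + '_access.log')
        info.size = len(line)
        tar.addfile(info, io.BytesIO(line))
    paths.append(path)

merged = os.path.join(tmp, 'access_log')
merge_archives(paths, merged)
with open(merged, 'rb') as f:
    print(f.read())
```

Because nothing is extracted into the working directory, there is no temporary file to remove afterwards.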
Complete code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: Su Yan
# @Date: 2014-01-17 12:05:19
# @Last Modified by: Su Yan
# @Last Modified time: 2014-01-17 14:15:41
import os
import collections
import hashlib
import urllib2
import json
import re
import tarfile
# settings
# documents http://sae.sina.com.cn/?m=devcenter&catId=281
api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'yansublog'
from_date = '20140101'
to_date = '20140116'
url_type = 'http' # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http access|debug|error|warning|notice|resources
secret_key = 'xxxxx'
# encode request
params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type
if url_type == 'http':
    params['type2'] = url_type2
params = collections.OrderedDict(sorted(params.items()))
request = ''
for k, v in params.iteritems():
    request += k + '=' + v + '&'
sign = request.replace('&', '')
sign += secret_key
md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()
request = api_url + request + 'sign=' + sign
# request api
response = urllib2.urlopen(request).read()
response = json.loads(response)
if response['errno'] != 0:
    print '[!] ' + response['errmsg']
    exit()
print '[#] request success'
# download and save files
log_files = list()
for down_url in response['data']:
    file_name = re.compile(r'\d{4}-\d{2}-\d{2}').findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, 'wb') as f:
        f.write(data)
print '[#] you got %d log files' % len(log_files)
# extract each archive and append its log to access_log
with open('access_log', 'w') as access_log:
    for log_file in log_files:
        tar = tarfile.open(log_file)
        log_name = tar.getnames()[0]
        tar.extract(log_name)
        tar.close()
        # save to access_log
        with open(log_name) as f:
            access_log.write(f.read())
        os.remove(log_name)
print '[#] all files have been written to access_log'
