Example: a Python script for batch-processing web page files
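An embedded web server differs from a traditional server: the web pages have to be converted into arrays and stored in flash so that the lwIP network interface can serve them. Recent work required frequent changes to the pages, and compressing and converting them by hand every time was tedious. So I used what I know to write a Python script that batch-processes the page files, compresses them, and converts them into arrays.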
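Environment the script was run in (later versions should be compatible):
Python 3.5.1 (see online tutorials for download, installation, and configuration)
node.js v4.4.7, with the uglifyjs package installed so that JS files can be minified beyond simple text stripping
uglifyjs: the engine used to compress JS files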
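Before running the script it can help to confirm that these prerequisites are actually reachable. The following is a minimal sketch, not part of the original script; the function name `missing_prerequisites` and its parameters are hypothetical.

```python
import shutil
import sys


def missing_prerequisites(tools=("uglifyjs",), min_python=(3, 5)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append("'%s' was not found on PATH" % tool)
    return problems


for problem in missing_prerequisites():
    print("warning:", problem)
```

If uglifyjs is missing, only the JS step is affected; the HTML and CSS steps rely on the Python standard library alone.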
The implementation is as follows:
#!/usr/bin/python
import os
import shutil
from functools import partial
import re
import gzip

# Create a new folder
def mkdir(path):
    path = path.strip()
    isExists = os.path.exists(path)
    # Create the folder only if it does not exist yet
    if not isExists:
        os.makedirs(path)
        print(path + ' created')
    return path

# Delete a folder (including everything inside it)
def deldir(path):
    path = path.strip()
    isExists = os.path.exists(path)
    # Delete the folder only if it exists
    if isExists:
        shutil.rmtree(path)
        print(path + " deleted")

# First-pass compression of a web page file
def FileReduce(inpath, outpath):
    with open(inpath, "r", encoding="utf-8") as infp, \
         open(outpath, "w", encoding="utf-8") as outfp:
        for li in infp.readlines():
            # Skip blank lines
            if li.split():
                # Remove the redundant \r \n
                li = li.replace('\n', '').replace('\t', '')
                # Keep only a single space between tokens
                li = ' '.join(li.split())
                outfp.writelines(li)
    print(outpath + " minified")

# Shell call (uses uglifyjs to compress JS files)
def ShellReduce(inpath, outpath):
    Command = "uglifyjs " + inpath + " -m -o " + outpath
    print(Command)
    os.system(Command)

# gzip compression
def FileGzip(inpath, outpath):
    with open(inpath, 'rb') as plain_file:
        with gzip.open(outpath, 'wb') as zip_file:
            zip_file.writelines(plain_file)
    print(outpath + " gzipped")

# Read a file as binary and save it as a C array
def FileHex(inpath, outpath):
    i = 0
    count = 0
    a = ''
    with open(inpath, 'rb') as inf, open(outpath, 'w') as outf:
        # Read one byte at a time until EOF
        records = iter(partial(inf.read, 1), b'')
        for r in records:
            r_int = int.from_bytes(r, byteorder='big')
            a += strzfill(hex(r_int), 2, 2) + ', '
            i += 1
            count += 1
            # 16 values per row
            if i == 16:
                a += '\n'
                i = 0
        # The array name is the output file name without its extension.
        # The closing ';' was missing in the original and is needed for valid C.
        a = ("const static char " + outpath.split('.')[-2].split('/')[-1]
             + "[" + str(count) + "]={\n" + a + "\n};\n\n")
        outf.write(a)
    print(outpath + " converted to array")

# Zero-fill the part of the string that starts at index, to width n
def strzfill(istr, index, n):
    return istr[:index] + istr[index:].zfill(n)

# Remove CSS comments /*.....*/
def unCommentReduce(inpath, outpath):
    with open(inpath, "r", encoding="utf-8") as infp, \
         open(outpath, "w", encoding="utf-8") as outfp:
        fileByte = infp.read()
        replace_reg = re.compile(r'/\*[\s\S]*?\*/')
        fileByte = replace_reg.sub('', fileByte)
        fileByte = fileByte.replace('\n', '').replace('\t', '')
        fileByte = ' '.join(fileByte.split())
        outfp.write(fileByte)
    print(outpath + " comments removed and minified")

# Main processing function
def WebProcess(path):
    # Original pages            ..\basic
    # Minified pages            ..\reduce
    # Second pass (gzip)        ..\gzip
    # Generated .c page files   ..\program
    BasicPath = os.path.join(path, "basic")
    ReducePath = os.path.join(path, "reduce")
    GzipPath = os.path.join(path, "gzip")
    ProgramPath = os.path.join(path, "program")

    # Delete the old output folders, then create a new one
    deldir(ProgramPath)
    deldir(ReducePath)
    deldir(GzipPath)
    mkdir(ProgramPath)

    for root, dirs, files in os.walk(BasicPath):
        for item in files:
            ext = item.split('.')
            InFilePath = root + "/" + item
            OutReducePath = mkdir(root.replace("basic", "reduce")) + "/" + item
            OutGzipPath = mkdir(root.replace("basic", "gzip")) + "/" + item + '.gz'
            OutProgramPath = ProgramPath + "/" + item.replace('.', '_') + '.c'

            # Processing depends on the file extension:
            # html          remove '\n' and '\t', keep a single space
            # css           remove /*......*/ comments, '\n' and '\t', keep a single space
            # js            compress with uglifyjs2
            # gif jpg ico   copy as-is
            # others        copy as-is
            # After that, each file is compressed into a .gz file.
            # Every type except "others" is also converted into a
            # hexadecimal array and saved as a .c file.
            if ext[-1] == 'html':
                FileReduce(InFilePath, OutReducePath)
                FileGzip(OutReducePath, OutGzipPath)
                FileHex(OutGzipPath, OutProgramPath)
            elif ext[-1] == 'css':
                unCommentReduce(InFilePath, OutReducePath)
                FileGzip(OutReducePath, OutGzipPath)
                FileHex(OutGzipPath, OutProgramPath)
            elif ext[-1] == 'js':
                ShellReduce(InFilePath, OutReducePath)
                FileGzip(OutReducePath, OutGzipPath)
                FileHex(OutGzipPath, OutProgramPath)
            elif ext[-1] in ["gif", "jpg", "ico"]:
                shutil.copy(InFilePath, OutReducePath)
                FileGzip(OutReducePath, OutGzipPath)
                FileHex(OutGzipPath, OutProgramPath)
            else:
                shutil.copy(InFilePath, OutReducePath)

# Get the folder that contains this script
path = os.path.split(os.path.realpath(__file__))[0]
WebProcess(path)
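ShellReduce builds the uglifyjs command by string concatenation, so a path that contains spaces would be split into several arguments. A minimal sketch of an alternative follows, assuming the same `-m` and `-o` flags used above; `build_uglify_command` and `shell_reduce` are hypothetical names, not part of the original script.

```python
import shutil
import subprocess


def build_uglify_command(inpath, outpath, executable="uglifyjs"):
    """Same flags as ShellReduce: -m mangles names, -o sets the output file."""
    return [executable, inpath, "-m", "-o", outpath]


def shell_reduce(inpath, outpath):
    # shutil.which resolves the full path of the executable on PATH
    # (on Windows it also considers PATHEXT, e.g. a .cmd wrapper).
    executable = shutil.which("uglifyjs")
    if executable is None:
        raise FileNotFoundError("uglifyjs was not found on PATH")
    # Passing a list avoids quoting problems with spaces in paths;
    # check=True raises CalledProcessError if uglifyjs exits with an error.
    subprocess.run(build_uglify_command(inpath, outpath, executable), check=True)
```

Unlike os.system, this variant stops with an exception when the minifier fails, so a broken JS file does not silently produce an empty array later on.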
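The script works as follows:
1. Walk the folder to be processed (..\basic; the user must create it, copy the files to be processed into it, and place the script in its parent folder). This is done by WebProcess.
2. Create the folder for minified pages (..\reduce, which stores the minified files). The script does this and handles each file type as follows:
html: remove redundant spaces and line breaks
css: remove redundant spaces, line breaks, and /*......*/ comments
js: compress with uglifyjs
gif, jpg, ico, and others: copy as-is
3. Create the gzip folder (..\gzip, which stores the files after the second compression pass). The script does this with the gzip module.
4. Create the output folder (..\program, which stores the generated .c files). The script does this and converts every gzipped file, except those of other types, into a hexadecimal array.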
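The minification rules in step 2 can be sketched as two in-memory functions. This is an illustration that mirrors what FileReduce and unCommentReduce do to the text, not a replacement for them; the names `minify_html` and `minify_css` are hypothetical.

```python
import re

# Non-greedy match so that each /* ... */ comment is removed separately.
_CSS_COMMENT = re.compile(r"/\*[\s\S]*?\*/")


def minify_html(text):
    """Collapse whitespace inside each line, then join lines with no separator."""
    # As in FileReduce, words split across two lines end up glued together,
    # and whitespace inside <pre> blocks is collapsed as well.
    return "".join(" ".join(line.split()) for line in text.splitlines())


def minify_css(text):
    """Drop /* ... */ comments, remove newlines and tabs, keep single spaces."""
    text = _CSS_COMMENT.sub("", text)
    text = text.replace("\n", "").replace("\t", "")
    return " ".join(text.split())


print(minify_html("<p>\n   hello   world\n</p>\n"))
print(minify_css("/* note */\nbody {\n\tcolor:  red;\n}\n"))
```

Because lines are joined without a separator, running text that wraps across lines in the source HTML should end each line with markup rather than a bare word.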
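Open a Windows command prompt in the script's folder (Shift + right-click) and run python web.py. The script then loops over these three stages (minify, gzip, convert to array) until every file has been processed.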
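The last two stages, gzip and array conversion, can also be sketched in memory without intermediate files. This is a sketch under assumptions: `to_c_array` and `page_to_c_array` are hypothetical names, and the array is declared `unsigned char` here (the script emits `const static char`) so that byte values above 0x7f fit without narrowing.

```python
import gzip


def to_c_array(name, data, per_line=16):
    """Format bytes as a C array definition, 16 values per row by default."""
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        rows.append(", ".join("0x%02x" % b for b in chunk))
    body = ",\n".join(rows)
    return "static const unsigned char %s[%d] = {\n%s\n};\n" % (name, len(data), body)


def page_to_c_array(name, text):
    """gzip the page text (encoded as UTF-8) and format the result as a C array."""
    compressed = gzip.compress(text.encode("utf-8"))
    return to_c_array(name, compressed)


print(page_to_c_array("index_html", "<p>hello world</p>"))
```

Every gzip stream starts with the bytes 0x1f 0x8b, which is a quick way to confirm that a generated array really holds compressed data.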
Important: all files to be processed must be saved as UTF-8; otherwise reading them fails with a "gbk" decoding error.
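One way to catch such files before the script starts is to try decoding them as UTF-8 first. A minimal sketch, with the hypothetical helper names `is_utf8` and `non_utf8_files`:

```python
def is_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_files(paths):
    """Return the subset of paths whose contents are not valid UTF-8."""
    bad = []
    for path in paths:
        with open(path, "rb") as fp:
            if not is_utf8(fp.read()):
                bad.append(path)
    return bad
```

This check only makes sense for the text files (html, css, js); images are binary and will normally fail it.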
The result looks like this:
HTML file: [image]
Converted array: [image]
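As a bonus, here is a small script that counts the total lines and blank lines of selected source files in the current directory and its subfolders (a by-product of testing the script above):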
#!/usr/bin/python
import os

total_count = 0
empty_count = 0

# Add the total and blank line counts of one file to the global counters
def CountLine(path):
    global total_count
    global empty_count
    # errors="ignore" keeps one badly encoded file from aborting the count
    with open(path, encoding="utf-8", errors="ignore") as tempfile:
        for lines in tempfile:
            total_count += 1
            if len(lines.strip()) == 0:
                empty_count += 1

# Walk the tree and count every file with a selected extension
def TotalLine(path):
    for root, dirs, files in os.walk(path):
        for item in files:
            ext = item.split('.')
            ext = ext[-1]
            if ext in ["cpp", "c", "h", "java", "php"]:
                subpath = root + "/" + item
                CountLine(subpath)

# Get the folder that contains this script
path = os.path.split(os.path.realpath(__file__))[0]
TotalLine(path)
print("Input Path:", path)
print("total lines: ", total_count)
print("empty lines: ", empty_count)
print("code lines: ", (total_count - empty_count))