Table of Contents
0x01 Article Background
0x02 Use python to delete files
0x03 How the Windows file system defines long paths
0x04 Modifying the Python program to delete long-path files
0x04 Summary and reflections
0x05 References

Step by step using Python to delete long path files under Windows

Apr 12, 2023 pm 01:31 PM
python windows root

0x01 Article Background

Recently, the storage of one of my company's business systems was approaching its limit, and the server would soon be unable to run. Business system A contains multiple subsystems A1, A2, A3 ... An, and for design reasons the intermediate files of all these subsystems are stored in the same parent directory. The only difference is that the names of the files and folders each subsystem generates start with that subsystem's name: files generated by subsystem A1 are named A1xxxxxx, and files generated by subsystem A2 are named A2xxxxx. We now needed to delete the historical files of some of these subsystems to free up server space. With dozens of terabytes of data stored together, manual deletion is clearly impractical, so the work had to be automated with a program. Which tool to use? Naturally I thought of Python. A plain file-deletion task would not normally deserve a long write-up, but I ran into some unusual problems with interesting solutions, such as deleting files with very long paths on Windows and finding the answer by reading the official English documentation, so I would like to share them. Let's get to the point.

0x02 Use python to delete files

There are many ways to delete files with Python. The most direct and convenient is to call functions from the standard library:

  • os.remove() Delete a file
  • os.rmdir() Delete an empty folder
  • shutil.rmtree() Delete a folder and everything under it (including subdirectories and files)

In other words, the core of the solution is to work with these three functions. Back to the problem at hand: business system A contains multiple subsystems A1, A2, A3 ... An, and for design reasons their intermediate files are all stored in the same parent directory. The only difference is that the names of the files and folders each subsystem generates start with the subsystem name: files generated by A1 are named A1xxxxxx, and files generated by A2 are named A2xxxxx. The goal is to delete the files generated by a specified subsystem and keep the files of the others.

Breaking down the requirement gives four questions to answer:

1. How to delete a file?

2. How to identify that a file or folder is generated by a certain subsystem?

3. How to determine whether a path is a file or a directory?

4. How to locate files and folders generated by all specified subsystems?

For question 1, as explained at the beginning of this section, Python's standard library functions do the deletion:

os.remove("path")      # delete the specified file
os.rmdir("path")       # delete an empty folder
shutil.rmtree("path")  # delete a folder and everything under it (including subdirectories and files)

For question 2, the files and folders generated by a given subsystem follow a fixed naming pattern. For example, the files generated by subsystem A1 are all named A1xxxxx, so they can be identified by keyword matching. One possible way is:

if keywords in filepath:  # the file name contains the keyword
    os.remove(filepath)   # delete the file
else:
    pass
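A substring test on the whole path also matches a file whose parent directory happens to contain the keyword. Since the names are described as starting with the subsystem name, a stricter check on the last path component may be safer. The sketch below is only an illustration: `belongs_to` is a hypothetical helper name, and it uses `ntpath` so that Windows path rules apply even when the snippet is tried on another platform.

```python
import ntpath  # Windows path rules, importable on any platform


def belongs_to(path, subsystem):
    """Return True if the last component of path starts with subsystem.

    Hypothetical helper; the prefix rule is an assumption based on the
    naming scheme described in the article.
    """
    name = ntpath.basename(path.rstrip("\\/"))
    return name.startswith(subsystem)


# A substring test matches the parent directory name "A1data" ...
sample = r"C:\A1data\A2report.txt"
substring_match = "A1" in sample
# ... while the prefix test looks only at "A2report.txt".
prefix_match = belongs_to(sample, "A1")
```

Here `substring_match` is true while `prefix_match` is false, which is the difference that matters when files of other subsystems must be kept.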

For question 3, directories and files are deleted with different functions, so before deleting a path we need to determine whether it is a directory or a file and choose the matching function. In Python this is done with functions such as os.path.isdir(), mainly the following two:

os.path.isdir("path")   # True if the path is an existing directory, otherwise False
os.path.isfile("path")  # True if the path is an existing regular file, otherwise False
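Putting questions 1 and 3 together, the type check and the deletion call can be wrapped in one function. This is a minimal sketch: `delete_path` and its return values are names invented for illustration, and symbolic links are not treated specially.

```python
import os
import shutil


def delete_path(path):
    """Delete a file or a directory tree and report what was removed.

    Hypothetical helper: returns "file", "dir", or "missing".
    """
    if os.path.isfile(path):
        os.remove(path)      # regular file
        return "file"
    if os.path.isdir(path):
        shutil.rmtree(path)  # directory and everything below it
        return "dir"
    return "missing"         # neither an existing file nor a directory
```

Checking the type first means os.remove() is never called on a directory and shutil.rmtree() is never called on a file, both of which would raise an exception.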

For question 4, locating all the files to be deleted is really a directory traversal problem: how to visit every folder and file under a specified directory. There are generally two approaches, depth-first traversal and breadth-first traversal. In this case the two are equally efficient, because every file has to be visited in the end. Fortunately, Python's standard library already implements directory traversal: os.walk("path") visits every directory under path, top-down by default (each directory is reported before its subdirectories), and lists the folders and files in each one. A typical usage is:

import os

path = r"C:\A"  # raw string, so the backslash is not treated as an escape

for root, dirs, files in os.walk(path):
    print(root)
    print(dirs)
    print(files)

In the above example, root is the directory currently being visited, dirs lists the subdirectories directly under it, and files lists the files directly under it. In this way the whole tree under the specified directory is traversed.
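To see the visiting order for yourself, the following sketch records the directories os.walk() reports in its default top-down mode. `walk_order` and `skip_prefix` are names made up for this example. It also shows that editing `dirs` in place controls which subdirectories are visited next, which is documented behavior of os.walk() when topdown is true.

```python
import os


def walk_order(top, skip_prefix=None):
    """Return the directories os.walk visits, relative to top, in order."""
    visited = []
    for root, dirs, files in os.walk(top):
        if skip_prefix is not None:
            # In-place edit: os.walk will not descend into removed entries.
            dirs[:] = [d for d in dirs if not d.startswith(skip_prefix)]
        dirs.sort()  # in-place sort makes the visiting order deterministic
        visited.append(os.path.relpath(root, top))
    return visited
```

For a tree containing A1x/deep and A2x, the order is the top directory, then A1x, then A1x/deep, then A2x: each directory comes before its own subdirectories, and a whole subtree is finished before its siblings are visited.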

The problems have been decomposed. Let’s combine the problems to complete the code implementation.

The final code implementation is:

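import os
import shutil

path = r"C:\A"  # raw string, so the backslash is not treated as an escape
keyword = "A1"

for root, dirs, files in os.walk(path):
    # Iterate over a copy so that deleted folders can be dropped from dirs,
    # which stops os.walk from trying to descend into them.
    for dirname in list(dirs):
        if keyword in dirname:
            rmpath = os.path.join(root, dirname)
            print("Deleting folder: %s" % rmpath)
            shutil.rmtree(rmpath)
            dirs.remove(dirname)
    for filename in files:
        if keyword in filename:
            rmpath = os.path.join(root, filename)
            print("Deleting file: %s" % rmpath)
            os.remove(rmpath)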

That is, os.walk() traverses the specified directory top-down, each subdirectory and file is checked against the keyword condition, and those that match are deleted.

Running it gives the following result:

[Screenshot: output of the script]
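With dozens of terabytes at stake, it can be reassuring to list what would be deleted before deleting anything. The following is a sketch of the same loop wrapped in a function with a dry-run switch. `delete_matching` and `dry_run` are names invented here, and the match is a prefix test rather than the substring test used above, on the assumption that names always start with the subsystem name.

```python
import os
import shutil


def delete_matching(top, keyword, dry_run=True):
    """Delete entries under top whose name starts with keyword.

    Returns the list of matched paths. With dry_run=True nothing is
    deleted, so the list can be reviewed first.
    """
    matched = []
    for root, dirs, files in os.walk(top):
        for name in list(dirs):  # copy, because dirs is modified below
            if name.startswith(keyword):
                target = os.path.join(root, name)
                matched.append(target)
                dirs.remove(name)  # do not descend into a matched folder
                if not dry_run:
                    shutil.rmtree(target)
        for name in files:
            if name.startswith(keyword):
                target = os.path.join(root, name)
                matched.append(target)
                if not dry_run:
                    os.remove(target)
    return matched
```

A possible workflow is to call `delete_matching(path, "A1")` first, inspect the returned list, and only then call it again with `dry_run=False`.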

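At this point the requirement seemed to be basically met, but in actual testing some very deep directories were not deleted. Deleting them raised an error, described as follows:

Unexpected error: (<type 'exceptions.WindowsError'>, WindowsError(3, 'The system cannot find the path specified'), <traceback object at 0x0000000002714F88>)

Roughly, Python cannot find the path. But why? After some more research I traced the cause to the path being too long: by default, Windows limits user-mode paths to 260 characters (MAX_PATH). But if the official maximum is 260 characters, how could longer paths have been created in the first place? Since they can be created, they can certainly be deleted too; it just takes the right method. After some study I found several approaches. Below I introduce the most practical one. The others, such as compressing the files with an archiver and then deleting them (a result from Baidu Zhidao), suit manual work but not a programmatic solution. The method is described in the next section.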
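To check whether a given path runs into this limit, its length can be compared with MAX_PATH. This is a rough sketch: `is_too_long` is a hypothetical helper, and it relies on the fact that MAX_PATH is 260 and counts the terminating null character, so a path of 260 or more visible characters does not fit.

```python
MAX_PATH = 260  # Windows constant; the count includes the terminating null


def is_too_long(abs_path, limit=MAX_PATH):
    """Return True if abs_path does not fit within the classic path limit."""
    return len(abs_path) >= limit


# Drive letter, colon, backslash plus 256 characters: 259 in total, still fits.
fits = "C:\\" + "a" * 256
# One more character and the path no longer fits.
too_long = "C:\\" + "a" * 257
```

Running such a check over the paths produced by os.walk() is one way to find the entries that the plain script fails to delete.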

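0x03 How the Windows file system defines long paths

For the problem of deleting long-path files on Windows, the most authoritative source is the official Windows documentation. I read Microsoft's definition and explanation of path length and found the solution there. Microsoft's original text is as follows:

[Screenshot: excerpt from the Microsoft documentation]

The key points are:

1. A file path handled by the Windows API can in theory be up to 32,767 characters long, but in the normal case users get no more than 260 characters, supposedly to make things easier for users. I have to complain a little here: it may be more convenient in some ways, but it also causes inconvenience. Defining a length of 32,767 and then handing out only 260 seems rather stingy.

2. A user who wants to break this limit can tell Windows so in a special way: prepend the string "\\?\" to the absolute path.

3. Later on, the document also describes how to lift the path length limit through the registry on Windows 10 and later. I did not take a screenshot of that part because it is not universal: what about Win7? Interested readers can read the original at https://docs.microsoft.com/en-US/windows/win32/fileio/maximum-file-path-limitation?tabs=cmd

Having read this far, the solution is obvious, and it could hardly be simpler: just prepend "\\?\" to the absolute path:

# Get the absolute path of the target and prepend \\?\
# to lift the Windows path length limit
path = '\\\\?\\' + os.path.abspath(path)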
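The one-liner above covers paths on a local drive such as C:\A. For a network share, the Microsoft documentation gives a different form of the prefix, "\\?\UNC\server\share". The sketch below handles both cases. `to_extended_path` is a hypothetical helper, it expects a path that is already absolute, and it uses `ntpath` so the string handling follows Windows rules on any platform.

```python
import ntpath  # Windows path rules, importable on any platform

EXTENDED_PREFIX = "\\\\?\\"           # the literal text \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # the literal text \\?\UNC\
DEVICE_PREFIX = "\\\\.\\"             # the literal text \\.\


def to_extended_path(abs_path):
    """Return abs_path with the extended-length prefix added.

    Hypothetical helper; abs_path is assumed to be absolute already.
    """
    if abs_path.startswith((EXTENDED_PREFIX, DEVICE_PREFIX)):
        return abs_path  # already prefixed, or a device path: leave as is
    # Normalize first (forward slashes, "..", "."), because Windows does
    # not normalize a path that carries the prefix.
    normalized = ntpath.normpath(abs_path)
    if normalized.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + normalized[2:]
    return EXTENDED_PREFIX + normalized
```

On Windows, the result of `to_extended_path(os.path.abspath(path))` can then be passed to os.walk() in place of the plain path.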

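0x04 Modifying the Python program to delete long-path files

Following the previous section, we modify the Python program further to lift the Windows path length limit, and the final deletion tool takes shape:

import os
import shutil

path = r"C:\A"  # raw string, so the backslash is not treated as an escape
keyword = "A1"

# Get the absolute path of the target and prepend \\?\
# to lift the Windows path length limit
path = '\\\\?\\' + os.path.abspath(path)

for root, dirs, files in os.walk(path):
    # Iterate over a copy so that deleted folders can be dropped from dirs,
    # which stops os.walk from trying to descend into them.
    for dirname in list(dirs):
        if keyword in dirname:
            rmpath = os.path.join(root, dirname)
            print("Deleting folder: %s" % rmpath)
            shutil.rmtree(rmpath)
            dirs.remove(dirname)
    for filename in files:
        if keyword in filename:
            rmpath = os.path.join(root, filename)
            print("Deleting file: %s" % rmpath)
            os.remove(rmpath)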

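The code is short and only one line was added, but that one line does the core job; it really is the soul of the program. In the end the tool performed very well in the production environment, and the server was running smoothly again.

0x04 Summary and reflections

I won't ramble on; here are a few thoughts:

1. When you run into a problem, break it down into small problems and solve them one by one.

2. Get good at reading official technical documentation. Sometimes the core of a solution is very simple, perhaps just a line or two of code, but it is tucked away in some corner and you may well not find it without reading carefully.

3. Python is a good tool. Make a habit of turning problems into something Python can solve; once the habit is formed, Python may well play a big role in your work.

0x05 References

1. https://docs.microsoft.com/en-US/windows/win32/fileio/maximum-file-path-limitation?tabs=cmd

2. https://stackoverflow.com/questions/6996603/how-to-delete-a-file-or-folder-in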
