os.walk generator
os.walk(PATH): PATH is a folder path; relative paths such as . or ../ also work.
What is returned is a generator of triples, not a list; each triple describes the contents of one folder. The first triple describes PATH itself.
Each triple is (the path of the current folder, the list of subfolder names in this folder, the list of file names in this folder).
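To make the shape of these triples concrete, here is a minimal sketch. The temporary folder and the names a.txt, sub and b.txt are invented for the demonstration; only os.walk itself comes from the text above.

```python
import os
import tempfile

# Build a small throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("demo")

    # os.walk yields one (dirpath, dirnames, filenames) triple per folder,
    # starting with the top folder itself.
    triples = [(os.path.relpath(dp, root), sorted(dn), sorted(fs))
               for dp, dn, fs in os.walk(root)]

print(triples)
# [('.', ['sub'], ['a.txt']), ('sub', [], ['b.txt'])]
```

Paths are shown relative to the temporary root only to keep the output readable; os.walk itself reports dirpath with the prefix you passed in.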
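So,
To get all subfolders of the current folder (d is this triple), join each name with d[0], because d[1] is a list and cannot be passed to os.path.join directly:
[os.path.join(d[0], name) for name in d[1]]
To get all files of the current folder:
[os.path.join(d[0], name) for name in d[2]]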
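Since d[1] and d[2] are lists of bare names, each name has to be joined with d[0] separately. The sketch below collects full paths for a whole tree; the helper name collect_paths and the sample folder layout are assumptions made for illustration.

```python
import os
import tempfile

def collect_paths(top):
    """Return (folders, files) as sorted lists of full paths below top."""
    folders, files = [], []
    for d in os.walk(top):
        # d[0] is the current folder, d[1] its subfolder names, d[2] its file names.
        folders.extend(os.path.join(d[0], name) for name in d[1])
        files.extend(os.path.join(d[0], name) for name in d[2])
    return sorted(folders), sorted(files)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs", "img"))
    open(os.path.join(root, "docs", "index.html"), "w").close()

    folders, files = collect_paths(root)
    rel_folders = [os.path.relpath(p, root) for p in folders]
    rel_files = [os.path.relpath(p, root) for p in files]

print(rel_folders)
print(rel_files)
```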
The following example uses two nested loops inside a list comprehension. The traversal produces a list of all matching file paths, which is then looped over:
result = [os.path.join(dp, f)
          for dp, dn, fs in os.walk("_pages")
          for f in fs
          if os.path.splitext(f)[1] == '.html']
for fname in result:
    pass  # do something
This is equivalent to:
result = []
for dp, dn, fs in os.walk("_pages"):
    for f in fs:
        if os.path.splitext(f)[1] == '.html':
            result.append(os.path.join(dp, f))
for fname in result:
    pass  # do something
The final condition keeps only the files whose suffix is .html. You can also use glob:
result = [y for x in os.walk(PATH) for y in glob.glob(os.path.join(x[0], '*.txt'))]
You can also build the result lazily with iterators, so that no intermediate list is created:
from itertools import chain
import glob
import os
result = chain.from_iterable(glob.iglob(os.path.join(x[0], '*.txt')) for x in os.walk('.'))
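The practical difference with the iterator version is that nothing is scanned until the result is consumed. A small sketch of that behaviour, using a temporary folder and made-up file names (a.txt, b.md, notes/c.txt) as assumptions:

```python
import glob
import os
import tempfile
from itertools import chain

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "notes"))
    for rel in ("a.txt", "b.md", os.path.join("notes", "c.txt")):
        open(os.path.join(root, rel), "w").close()

    # Building the chain does not touch the disk yet.
    lazy = chain.from_iterable(
        glob.iglob(os.path.join(x[0], "*.txt")) for x in os.walk(root)
    )
    first = next(lazy)      # the first folder is scanned only at this point
    remaining = list(lazy)  # consume the rest of the tree
    found = sorted(os.path.relpath(p, root) for p in [first, *remaining])

print(found)
```

An iterator like this can be consumed only once; wrap it in list() if the paths are needed again.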
Advanced
os.walk, the standard directory-tree traversal generator, is both powerful and flexible. However, it still lacks some finer-grained features that applications need, such as selecting files by pattern, sorting all files (or directories), or traversing only the current directory without entering its subdirectories. It is therefore useful to wrap it in a function:
import os, fnmatch

def filter_files(dirname, patterns='*', single_level=False, yield_folders=False):
    # patterns is a ';'-separated string, for example '*.py;*.txt'
    patterns = patterns.split(';')
    allfiles = []
    for rootdir, subdirnames, files in os.walk(dirname):
        print(subdirnames)  # show the subfolders found at this level
        allfiles.extend(files)
        if yield_folders:
            allfiles.extend(subdirnames)
        if single_level:
            break  # do not descend into subdirectories
    allfiles.sort()
    for eachpattern in patterns:
        for eachfile in fnmatch.filter(allfiles, eachpattern):
            print(os.path.normpath(eachfile))
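As a variation on the same idea, the wrapper can be written as a generator that yields full paths instead of printing bare names. This is only a sketch under assumptions: the name iter_matching and the sample files are hypothetical, and sorting is done per folder rather than over the whole tree.

```python
import fnmatch
import os
import tempfile

def iter_matching(top, patterns="*", single_level=False, yield_folders=False):
    """Yield full paths under top whose base name matches any ';'-separated pattern."""
    pattern_list = patterns.split(";")
    for dirpath, dirnames, filenames in os.walk(top):
        names = list(filenames)
        if yield_folders:
            names.extend(dirnames)
        for name in sorted(names):
            if any(fnmatch.fnmatch(name, pat) for pat in pattern_list):
                yield os.path.join(dirpath, name)
        if single_level:
            break  # stop after the top-level folder

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg"))
    for rel in ("run.py", "notes.txt", os.path.join("pkg", "mod.py")):
        open(os.path.join(root, rel), "w").close()

    everything = [os.path.relpath(p, root)
                  for p in iter_matching(root, "*.py;*.txt")]
    top_only = [os.path.relpath(p, root)
                for p in iter_matching(root, "*.py", single_level=True)]

print(everything)
print(top_only)
```

Yielding paths leaves it to the caller to print, collect, or further filter the results.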
Notes:
1. The difference between extend and append
Lists are implemented as a class, so "creating" a list actually instantiates that class, and a list can be manipulated through its methods. A list can contain elements of any data type, and the elements of one list do not all need to be of the same type. The append() method takes exactly one argument and adds it to the end of the list as a single new element. The extend() method also takes one argument, which must be an iterable (typically a list), and adds each of its elements to the original list.
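A short illustration of the difference, with made-up file names:

```python
appended = ["a.txt"]
appended.append(["b.txt", "c.txt"])   # the whole list becomes ONE new element

extended = ["a.txt"]
extended.extend(["b.txt", "c.txt"])   # each element is added separately

from_string = []
from_string.extend("ab")              # any iterable works; a string adds its characters

print(appended)     # ['a.txt', ['b.txt', 'c.txt']]
print(extended)     # ['a.txt', 'b.txt', 'c.txt']
print(from_string)  # ['a', 'b']
```

This is why the traversal code uses extend to merge the file names of each folder into one flat list.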
2. fnmatch module
The fnmatch module matches file names against patterns. The pattern syntax is the same as that used in Unix shells. An asterisk (*) matches zero or more characters, and a question mark (?) matches exactly one character. You can also use square brackets to specify a character range, for example [0-9] matches any single digit. All other characters match themselves.
1) fnmatch.fnmatch(name, pattern): tests whether name matches pattern and returns True or False
2) fnmatch.filter(names, pat): filters the list names and returns a list of the names that match the pattern pat
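The two functions and the wildcard rules can be tried out as follows. The file names are invented for the example and kept lowercase where it matters, since fnmatch.fnmatch normalizes case according to the operating system.

```python
import fnmatch

names = ["index.html", "page1.html", "page12.html", "notes.md", "a.py"]

# ? matches exactly one character, so "page12" is too long for "page?".
one_char = fnmatch.fnmatch("page1.html", "page?.html")
two_chars = fnmatch.fnmatch("page12.html", "page?.html")

# [0-9] matches a single digit.
digit = fnmatch.fnmatch("page1.html", "page[0-9].html")

# filter() returns the matching names in their original order.
html_files = fnmatch.filter(names, "*.html")

print(one_char, two_chars, digit)  # True False True
print(html_files)                  # ['index.html', 'page1.html', 'page12.html']
```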