1. open
After opening a file with open(), remember to call the close() method of the file object. For example, a try/finally statement ensures that the file is closed even if an error occurs while reading.
file_object = open('thefile.txt')
try:
    all_the_text = file_object.read()
finally:
    file_object.close()
Note: Do not put the open() call inside the try block. If opening the file raises an exception, file_object is never assigned, so the close() call in the finally block would itself fail.
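The same guarantee can be written more compactly with a with statement, which closes the file when the block ends, whether or not an exception was raised. The following is only a minimal sketch, written for Python 3; the helper name read_all and the temporary demo file are illustrative assumptions, not part of the original text.

```python
import os
import tempfile


def read_all(path):
    """Return the whole content of a text file, closing it even on error."""
    # open() runs before the block is entered, so if it fails there is
    # no file object that would need closing.
    with open(path) as file_object:
        return file_object.read()


# Illustrative usage with a throwaway file (hypothetical content).
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('hello\nworld\n')

text = read_all(demo_path)
os.remove(demo_path)
```

The try/finally form and the with form behave the same here; the with form simply leaves less room for forgetting the close() call.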
2. Read files
Read text files
input = open('data', 'r')
# The second parameter defaults to 'r'
input = open('data')
Read binary files
input = open('data', 'rb')
Read all contents
file_object = open('thefile.txt')
try:
    all_the_text = file_object.read()
finally:
    file_object.close()
Read fixed bytes
file_object = open('abinfile', 'rb')
try:
    while True:
        # Read at most 100 bytes; an empty result means end of file
        chunk = file_object.read(100)
        if not chunk:
            break
        do_something_with(chunk)
finally:
    file_object.close()
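The chunked loop above can also be wrapped in a generator so that callers just iterate over the blocks. This is a sketch under assumptions: it targets Python 3, and the name iter_chunks and the 250-byte demo file are made up for illustration.

```python
import os
import tempfile
from functools import partial


def iter_chunks(path, chunk_size=100):
    """Yield successive blocks of at most chunk_size bytes from a file."""
    with open(path, 'rb') as file_object:
        # iter(callable, sentinel) keeps calling read(chunk_size) until
        # it returns b'', which is what read() gives at end of file.
        for chunk in iter(partial(file_object.read, chunk_size), b''):
            yield chunk


# Illustrative usage: a 250-byte file is read as 100 + 100 + 50 bytes.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'x' * 250)

chunk_sizes = [len(chunk) for chunk in iter_chunks(demo_path)]
os.remove(demo_path)
```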
Read line by line
If the file is a text file, you can also iterate over the file object directly to get each line:
for line in file_object:
    do_something_with(line)
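As a concrete case of iterating over a file object, the sketch below counts the lines that are not blank. It assumes Python 3; count_nonblank_lines and the demo content are hypothetical names and data chosen only for illustration.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count lines that contain something other than whitespace."""
    count = 0
    with open(path) as file_object:
        # Iterating yields one line at a time, including its trailing
        # newline, without loading the whole file into memory.
        for line in file_object:
            if line.strip():
                count += 1
    return count


# Illustrative usage: five lines, two of them blank.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('a\n\nb\n  \nc')

nonblank = count_nonblank_lines(demo_path)
os.remove(demo_path)
```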
3. Write file
Write text file
output = open('data', 'w')
Write binary file
output = open('data', 'wb')
Append to file
# 'a' appends; 'w+' would truncate the file before writing
output = open('data', 'a')
Write data
file_object = open('thefile.txt', 'w')
file_object.write(all_the_text)
file_object.close()
Write multiple lines
file_object.writelines(list_of_text_strings)
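Note that writelines() does not add line separators, so each string must already end with a newline if you want separate lines. Writing many lines with one writelines() call generally performs at least as well as calling write() once per line.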
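Because writelines() writes each string exactly as given, the caller has to supply the newlines. A minimal sketch, assuming Python 3 and using a throwaway temporary file; the list contents are made up for illustration.

```python
import os
import tempfile

lines = ['first', 'second', 'third']

fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as file_object:
    # Append '\n' to every item; writelines() would otherwise run the
    # strings together on one line.
    file_object.writelines(line + '\n' for line in lines)

with open(demo_path) as file_object:
    written = file_object.read()
os.remove(demo_path)
```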
When processing log files, we often run into this situation: the log file is huge and cannot be read into memory in one go. For example, when a 2 GB log file has to be processed on a machine with 2 GB of physical memory, we may want to process only 200 MB of its contents at a time.
In Python, the built-in file object provides a readlines(sizehint) method for exactly this purpose. Take the following code as an example:
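file = open('test.log', 'r')
sizehint = 209715200  # 200 MB
lines = file.readlines(sizehint)
while lines:
    # Process this batch of lines here, then read the next batch;
    # readlines() returns an empty list at end of file
    lines = file.readlines(sizehint)
file.close()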
Each call to readlines(sizehint) returns approximately 200 MB of data, and the returned data always consists of complete lines. In most cases the amount of data returned is slightly larger than the value specified by sizehint (except on the last call). In Python 2, the sizehint value may also be rounded up to a multiple of an internal buffer size.
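The batch-reading idea can be packaged as a generator that yields one list of complete lines per readlines() call. This is a sketch rather than a definitive implementation: it assumes Python 3, iter_line_batches is a hypothetical helper name, and a small 1024 hint with a generated log file stands in for the 200 MB case.

```python
import os
import tempfile


def iter_line_batches(path, sizehint):
    """Yield lists of complete lines, each list roughly sizehint in size."""
    with open(path, 'r') as log_file:
        while True:
            lines = log_file.readlines(sizehint)
            if not lines:  # an empty list means end of file
                break
            yield lines


# Illustrative usage with a small generated log file.
fd, demo_path = tempfile.mkstemp(suffix='.log')
with os.fdopen(fd, 'w') as f:
    for i in range(1000):
        f.write('log entry %d\n' % i)

batches = list(iter_line_batches(demo_path, sizehint=1024))
os.remove(demo_path)
total_lines = sum(len(batch) for batch in batches)
```

Every batch ends on a line boundary, so each one can be processed independently of the others.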
file is a built-in type in Python 2 that is used to work with external files in Python programs. Everything in Python is an object, and file is no exception: it has its own methods and attributes. (Python 3 removed the file type; use open() there.) Let’s first look at how to create a file object:
file(name[, mode[, buffering]])
The file() function creates a file object. It has an alias, open(), which is arguably the more descriptive name; both are built-in functions. Let’s take a look at the parameters. name and mode are passed as strings, and buffering is an integer. name is the name of the file.
mode is the open mode. The optional values are r, w, a and U, which stand for read (default), write, append, and universal newline support. If a file is opened in w or a mode and does not exist, it is created automatically. Opening an existing file in w mode clears its original content, because the file is truncated as soon as it is opened. For historical reasons, different systems use different newline conventions: Unix uses '\n' and Windows uses '\r\n'. Opening a file in U mode accepts all of them, so '\r', '\n' and '\r\n' all count as line endings, and the newlines attribute records which ones were actually found in the file. Whichever convention the file uses, Python presents every line ending as '\n' when reading. After the mode character you can also add +, b or t, which mean that the file is opened for both reading and writing, in binary mode, or in text mode (default), respectively.
buffering controls buffering: 0 means no buffering; 1 means line buffering; a number greater than 1 is the buffer size in bytes; a negative value means the system default.
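The difference between the w, a and r+ modes is easiest to see by running them against the same file. The sketch below assumes Python 3 and a throwaway temporary file; the strings written are arbitrary.

```python
import os
import tempfile

fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

with open(demo_path, 'w') as f:   # creates or truncates the file
    f.write('one\n')
with open(demo_path, 'w') as f:   # truncates again, so 'one' is gone
    f.write('two\n')
with open(demo_path, 'a') as f:   # appends after the existing content
    f.write('three\n')
with open(demo_path, 'r+') as f:  # read and write, no truncation
    before = f.read()
    f.seek(0, os.SEEK_END)        # make the write position explicit
    f.write('four\n')
with open(demo_path) as f:
    after = f.read()
os.remove(demo_path)
```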
The file object has its own properties and methods. Let’s first look at the attributes of file.
closed #Whether the file has been closed; set to True by close()
encoding #File encoding
mode #Open mode
name #File name
newlines #The newline types found in the file so far: None, a single string, or a tuple of strings
softspace #Boolean, usually 0; used internally by the print statement (Python 2 only)
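A few of these attributes can be inspected directly on an open file object. This is a small sketch assuming Python 3, where name, mode and closed are still available (softspace is not); the temporary file is only for illustration.

```python
import os
import tempfile

fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

f = open(demo_path, 'w')
name_matches = (f.name == demo_path)  # name is the path passed to open()
mode_while_open = f.mode              # the mode string passed to open()
closed_before = f.closed
f.close()
closed_after = f.closed

# Using the object after close() raises ValueError.
try:
    f.write('too late')
    write_error = None
except ValueError as exc:
    write_error = exc
os.remove(demo_path)
```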
Read and write methods of file:
F.read([size]) #size is the maximum length to read, in bytes; if omitted, read to the end of the file
F.readline([size])
#Read one line. If size is given, only part of a line may be returned
F.readlines([size])
#Return a list in which each line of the file is one item. Conceptually this is the same as calling readline() in a loop. If the size parameter is provided, it is a hint for the total length to read, which means that only part of the file may be read.
F.write(str)
#Write str to the file, write() will not add a newline character after str
F.writelines(seq)
#Write every string in seq to the file. Like write(), it writes the strings as-is and does not add a newline after each one.
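The three read methods can be mixed on the same file object, since each one continues from the current position. A minimal sketch assuming Python 3 text mode and a three-line demo file made up for illustration:

```python
import os
import tempfile

fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('alpha\nbeta\ngamma\n')

with open(demo_path) as f:
    first_three = f.read(3)       # 'alp'
    rest_of_line = f.readline()   # 'ha\n', the rest of the first line
    partial_line = f.readline(2)  # 'be', size cuts the line short
    remaining = f.readlines()     # ['ta\n', 'gamma\n']
os.remove(demo_path)
```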
Other methods of file:
F.close()
#Close the file. Python closes a file automatically once the object is no longer referenced, but when that happens is not guaranteed, so it is best to get into the habit of closing files yourself. Operating on a file after it has been closed raises ValueError
F.flush()
#Flush the internal buffer, handing its contents to the operating system
F.fileno()
#Return the integer file descriptor that the operating system uses for this file
F.isatty()
#Whether the file is connected to a terminal device (on a Unix system)
F.tell()
#Return the current file position, measured from the beginning of the file
F.next()
#Return the next line and move the file position past it. When a file is used in a statement such as for ... in file, next() is called to drive the iteration. (In Python 3, call the built-in next(F) instead.)
F.seek(offset[,whence])
#Move the file position to offset. The offset is normally counted from the beginning of the file and is then a non-negative number, but the whence parameter changes this: 0 counts from the beginning of the file (the default), 1 from the current position, and 2 from the end of the file. Note that if the file is opened in a or a+ mode, every write goes to the end of the file regardless of the current position.
F.truncate([size])
#Truncate the file to the given size. The default is the current file position. If size is larger than the current file size, the result depends on the system: the file may be left unchanged, padded to the requested size with zero bytes, or padded with undefined content.
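The positioning methods seek(), tell() and truncate() are easiest to follow on a small binary file, where positions are plain byte offsets. The sketch below assumes Python 3; the ten-byte demo file is made up, and os.SEEK_CUR and os.SEEK_END are the named constants for whence values 1 and 2.

```python
import os
import tempfile

fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'0123456789')

with open(demo_path, 'r+b') as f:
    f.seek(4)                  # absolute position, whence defaults to 0
    pos_after_seek = f.tell()  # 4
    f.seek(2, os.SEEK_CUR)     # 2 bytes forward from the current position
    byte_at_6 = f.read(1)      # b'6'
    f.seek(-3, os.SEEK_END)    # 3 bytes before the end of the file
    tail = f.read()            # b'789'
    f.seek(5)
    f.truncate()               # cut the file at the current position

with open(demo_path, 'rb') as f:
    after_truncate = f.read()  # b'01234'
os.remove(demo_path)
```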