I am using Python 3.5.2. I am trying to use the zipfile module's zipfile.ZipFile.open method to open a text file inside a zip archive. Even when I pass the rU mode described in the documentation, the file is still opened as binary data, which is puzzling.
Code:
>>> import zipfile
>>> zf = zipfile.ZipFile('/Users/chiqingjun/Downloads/top-1m.csv.zip')
>>> zf.namelist()
['top-1m.csv']
>>> f = zf.open(zf.namelist()[0], mode='rU')
>>> f
<zipfile.ZipExtFile name='top-1m.csv' mode='rU' compress_type=deflate>
>>> f.readline()
b'1,google.com\n'
# still binary data
Official documentation (version 3.5.2):
In fact, the bytes output is not caused by how you called zipfile; it is how Python 3.5 behaves. ZipFile.open always returns a binary file object, so you need to decode the result yourself to get str. As the documentation says, rU only enables universal newlines, and this mode will be removed in 3.6. The contents of a compressed file have to be read as bytes; how to decode them afterwards is up to the programmer.
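As a minimal sketch of "read bytes, then decode", one option is to wrap the binary stream in io.TextIOWrapper, which decodes and also applies universal newlines. The helper name read_text_lines, the member name sample.csv, and the UTF-8 encoding are assumptions made for illustration; the archive is built in memory so the example does not depend on the file from the question.

```python
import io
import zipfile


def read_text_lines(archive, member, encoding="utf-8"):
    """Return the lines of a text member of a zip archive as str objects.

    Hypothetical helper for illustration; the encoding is an assumption
    and must match how the file was actually written.
    """
    with zipfile.ZipFile(archive) as zf:
        # ZipFile.open() gives a binary stream (default mode 'r').
        with zf.open(member) as raw:
            # TextIOWrapper decodes the bytes; newline=None turns on
            # universal newlines, so '\r\n' and '\r' are read as '\n'.
            with io.TextIOWrapper(raw, encoding=encoding, newline=None) as text:
                return list(text)


# Build a small in-memory archive so the sketch is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("sample.csv", b"1,google.com\r\n2,youtube.com\r\n")

lines = read_text_lines(buf, "sample.csv")
print(lines)  # ['1,google.com\n', '2,youtube.com\n']
```

If only a single line is needed, calling .decode('utf-8') on the bytes returned by f.readline() should work as well, again assuming the file is UTF-8.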