A usage example for CSV from the Python documentation (link):
The file is opened with mode "wb":
with open('rent.csv','wb') as csv_file:
In Python 3, strings and binary data are two different types, so I converted the str values to bytes:
# Encode the str values house_title, house_location, house_money and house_url as bytes
house_title = house_title.encode("utf8")
house_location = house_location.encode("utf8")
house_money = house_money.encode("utf8")
house_url = house_url.encode("utf8")
# Check the types of house_title and the other values
print(type(house_title),type(house_location),type(house_money),type(house_url))
# Write the data to the CSV file
with open('rent.csv','wb') as csv_file:
    csv_writer = csv.writer(csv_file,delimiter=',')
    csv_writer.writerow([house_title, house_location, house_money, house_url])
Error output
As you can see, the printed types of house_title, house_location, house_money and house_url are all bytes,
yet a type error is still raised afterwards.
from bs4 import BeautifulSoup
from urllib.parse import urljoin
import requests
import csv
url = "http://nj.58.com/pinpaigongyu/pn/{page}/?minprice=1000_1500"
page = 1
print("fetch: ", url.format(page=page))
# Fetch the target page
response = requests.get(url.format(page=page))
# Create a BeautifulSoup object
html = BeautifulSoup(response.text, "lxml")
# Get all li elements under class=list
house_list = html.select(".list > li")
for house in house_list:
    house_title = house.select("h2")[0].string
    house_url = urljoin(url, house.select("a")[0]["href"])
    house_info_list = house_title.split()
    house_location = house_info_list[1]
    house_money = house.select(".money")[0].select("b")[0].string
    # Encode the str values house_title, house_location, house_money and house_url as bytes
    house_title = house_title.encode("utf8")
    house_location = house_location.encode("utf8")
    house_money = house_money.encode("utf8")
    house_url = house_url.encode("utf8")
    # Check the types of house_title and the other values
    print(type(house_title),type(house_location),type(house_money),type(house_url))
    # Write the data to the CSV file
    with open('rent.csv','wb') as csv_file:
        csv_writer = csv.writer(csv_file,delimiter=',')
        csv_writer.writerow([house_title, house_location, house_money, house_url])
        # With the with statement there is no need to call csv_file.close()
From a quick look, there are quite a few problems~
#csv_file = open("rent.csv","wb")  # Delete this line; it is redundant
A small update:
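CSV is a text format, and in Python 3 csv.writer writes text rather than binary data, so do not open the file in binary mode, and there is no need to convert the data to bytes.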
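To see the point above in isolation, here is a minimal sketch that uses in-memory streams in place of the real rent.csv (io.BytesIO stands in for a file opened with 'wb', io.StringIO for one opened in text mode; the field values are made up for illustration). It is not the original poster's code, only a reduced reproduction of the same kind of failure.

```python
import csv
import io

# A binary stream stands in for open('rent.csv', 'wb').
binary_file = io.BytesIO()
try:
    # csv.writer builds a str for the row and passes it to write(),
    # which a binary stream rejects no matter what type the fields are.
    csv.writer(binary_file).writerow([b"title", b"location"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print("binary mode:", error_message)

# A text stream stands in for open('rent.csv', 'w', newline='').
text_file = io.StringIO(newline="")
csv.writer(text_file).writerow(["title", "location"])
text_output = text_file.getvalue()
print("text mode, str fields:", repr(text_output))

# bytes fields in text mode do not fail, but each one is written
# as its str() representation, which is not what is wanted.
bytes_in_text = io.StringIO(newline="")
csv.writer(bytes_in_text).writerow([b"title"])
bytes_output = bytes_in_text.getvalue()
print("text mode, bytes field:", repr(bytes_output))
```

So encoding the fields to bytes does not help in binary mode, and in text mode it only pollutes the output with b'...' wrappers.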
Another update:
The root cause is that the original poster misread the documentation, which led to the misunderstanding~
Change the mode argument of open from 'wb' to 'w'.
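A minimal sketch of the write step with that change applied, assuming Python 3. The helper name write_listings, the sample rows and the temporary output path are hypothetical and only there to make the example self-contained; the scraping part is left out.

```python
import csv
import os
import tempfile


def write_listings(path, listings):
    """Write rows of str values to a CSV file opened in text mode."""
    # 'w' instead of 'wb'; newline='' lets the csv module control line endings.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv_writer = csv.writer(csv_file, delimiter=",")
        csv_writer.writerows(listings)


# Hypothetical sample data: title, location, price, url (all plain str).
sample_listings = [
    ["Sample apartment A", "鼓楼", "1200", "http://example.com/a"],
    ["Sample apartment B", "江宁", "1500", "http://example.com/b"],
]

out_path = os.path.join(tempfile.mkdtemp(), "rent.csv")
write_listings(out_path, sample_listings)

# Read the file back to confirm the rows survived the round trip.
with open(out_path, newline="", encoding="utf-8") as csv_file:
    rows_read = list(csv.reader(csv_file))
print(rows_read)
```

The file is opened once and all rows are written inside the same with block; reopening it in 'w' mode for every listing would truncate it each time and leave only the last row.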
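The real reason Python 2.x needs 'wb' mode for writing
In Python 2.x, the file for CSV output must be created with the 'b' flag, i.e. csv.writer(open('test.csv','wb')); otherwise every other line comes out blank. The explanation found online is roughly this: writerow ends each row with '\r\n' (0x0D 0x0A), and a file opened in text mode on Windows converts every '\n' it writes into '\r\n', adding another 0x0D. The row then ends with 0x0D 0x0D 0x0A, which on Windows shows up as two lines. With the 'b' flag the file is written in binary mode and no extra 0x0D is added.
Python 3.x uses the newline='' argument of open to achieve the same effect.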
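The effect of newline='' can be checked by looking at the raw bytes on disk. The sketch below is an illustration under one assumption: passing newline='\r\n' to open is used to emulate the Windows text-mode translation on any platform. The helper name csv_bytes and the temporary file are hypothetical.

```python
import csv
import os
import tempfile


def csv_bytes(newline):
    """Write one CSV row in text mode and return the raw bytes on disk."""
    path = os.path.join(tempfile.mkdtemp(), "demo.csv")
    with open(path, "w", newline=newline, encoding="utf-8") as f:
        csv.writer(f).writerow(["a", "b"])
    with open(path, "rb") as f:
        return f.read()


# newline='\r\n' translates each '\n' written into '\r\n',
# which is what text mode does by default on Windows.
translated = csv_bytes("\r\n")
# newline='' turns translation off, so the csv module's own '\r\n' is kept as is.
untranslated = csv_bytes("")

print(translated)
print(untranslated)
```

With translation on, the row terminator gains a second carriage return, which is the blank-line symptom described above; with newline='' the terminator stays a single '\r\n'.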