links = sel.xpath('//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()
This raises: ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
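See the article: "Fixing the error raised when an xpath in Scrapy contains Chinese characters"
Method 1: make the entire xpath expression a Unicode string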
links = sel.xpath(u'//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()
Method 2: build the xpath expression from a title variable that is already Unicode
title = u"置顶"
links = sel.xpath('//i[contains(@title,"%s")]/following-sibling::a/@href' %(title)).extract()
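Methods 1 and 2 both come down to making sure the expression handed to sel.xpath is a text (Unicode) string rather than a byte string. A minimal sketch of that idea, assuming the bytes are UTF-8 encoded; ensure_text is a hypothetical helper written for this illustration, not part of Scrapy:

```python
# -*- coding: utf-8 -*-

def ensure_text(value, encoding="utf-8"):
    """Return value as a text string, decoding it first if it is bytes.

    Hypothetical helper for illustration only; UTF-8 is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# A byte string, which is what a plain '...' literal is in Python 2.
raw_title = u"置顶".encode("utf-8")

# Decode first, then interpolate into a Unicode xpath expression.
title = ensure_text(raw_title)
query = u'//i[contains(@title,"%s")]/following-sibling::a/@href' % title
```

The resulting query can then be passed to sel.xpath(query) in place of the byte-string version.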
Method 3: use the xpath variable syntax directly (a $ sign followed by the variable name, here $title) and pass title as a keyword argument
links = sel.xpath('//i[contains(@title,$title)]/following-sibling::a/@href', title=u"置顶").extract()
Try putting a u prefix in front of the whole string.