A cookie is data (usually encrypted) that a website stores on the user's local device in order to identify the user and track the session. For example, some websites require you to log in before you can access certain pages. If a crawler requests such a page without logging in, the content it gets differs from what a logged-in user sees, or access is denied altogether.
Python provides the cookiejar module, located in the http package, to support cookies. With it we can capture cookies and resend them on subsequent requests, which makes it possible, for example, to simulate a login. The main classes in this module are CookieJar, FileCookieJar, MozillaCookieJar, and LWPCookieJar.
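Of these classes, CookieJar keeps cookies in memory only, while MozillaCookieJar and LWPCookieJar (both built on FileCookieJar) can also save them to a file and load them back. The following is a minimal sketch of that round trip, not a complete recipe. The helper names `make_session_cookie` and `save_and_reload`, the cookie `session=abc123`, and the domain `example.com` are all made up for illustration. In a real crawler the cookie would come from a server response rather than being built by hand.

```python
import os
import tempfile
from http import cookiejar


def make_session_cookie(name, value, domain):
    """Build a cookie by hand (hypothetical helper, for illustration only)."""
    return cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(path):
    """Save one cookie to a Mozilla-format file, then load it into a new jar."""
    jar = cookiejar.MozillaCookieJar(path)
    jar.set_cookie(make_session_cookie('session', 'abc123', 'example.com'))
    # Session cookies are skipped on save unless ignore_discard=True
    jar.save(ignore_discard=True)

    # A fresh jar reading the same file should see the same cookie
    restored = cookiejar.MozillaCookieJar(path)
    restored.load(ignore_discard=True)
    return [(c.name, c.value, c.domain) for c in restored]


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        print(save_and_reload(os.path.join(tmp, 'cookies.txt')))
```

Passing `ignore_discard=True` matters here because a cookie without an expiry time is treated as a session cookie and would otherwise be left out of the file.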
How to obtain cookies
    ## Obtaining cookies
    # -*- coding: UTF-8 -*-
    from urllib import request
    from http import cookiejar

    if __name__ == '__main__':
        # Create a CookieJar instance to hold the cookies
        cookie = cookiejar.CookieJar()
        # Use HTTPCookieProcessor from urllib.request to create the cookie handler
        handler = request.HTTPCookieProcessor(cookie)
        # Build an opener from the cookie handler
        opener = request.build_opener(handler)
        # Open the page with the opener's open method
        response = opener.open('http://www.baidu.com')
        # Print the cookie information
        for item in cookie:
            print('Name = %s' % item.name)
            print('Value = %s' % item.value)
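The example above only captures cookies. To see the same opener resend them on a later request, here is a rough sketch that runs against a throwaway local server instead of a real website. `DemoHandler`, `run_demo`, and the cookie `session=abc123` are assumptions invented for this demonstration. The server sets a cookie on the first visit and echoes back whatever Cookie header it receives.

```python
import threading
from http import cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: sets a cookie, echoes the Cookie header."""

    def do_GET(self):
        received = self.headers.get('Cookie', '')
        body = received.encode('utf-8')
        self.send_response(200)
        if not received:
            # First visit: hand out a session cookie
            self.send_header('Set-Cookie', 'session=abc123; Path=/')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    # Port 0 lets the operating system pick a free port
    server = HTTPServer(('127.0.0.1', 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = 'http://127.0.0.1:%d/' % server.server_port

    jar = cookiejar.CookieJar()
    opener = request.build_opener(
        request.ProxyHandler({}),  # bypass any system proxy for localhost
        request.HTTPCookieProcessor(jar),
    )
    try:
        # First request: no cookie is sent, the server sets one
        first = opener.open(url).read().decode('utf-8')
        # Second request: the opener resends the stored cookie
        second = opener.open(url).read().decode('utf-8')
    finally:
        server.shutdown()
        server.server_close()
    return first, second, [(c.name, c.value) for c in jar]


if __name__ == '__main__':
    first, second, cookies = run_demo()
    print('First response echoed:', repr(first))
    print('Second response echoed:', repr(second))
    print('Cookies in jar:', cookies)
```

The same pattern applies to a simulated login: send the login request through the opener once, then reuse that opener so the stored cookies accompany every later request.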