This article introduces common commands for accessing and crawling web pages with Python's urllib library.
Common commands for Python to access and crawl web pages
Simple crawling of web pages:
import urllib.request

url = "http://google.cn/"
response = urllib.request.urlopen(url)  # returns a file-like response object
page = response.read()                  # read the raw page content (bytes)
Save the URL directly as a local file:
import urllib.request

url = "http://google.cn/"
response = urllib.request.urlopen(url)  # returns a file-like response object
page = response.read()                  # raw bytes of the page
with open("google.html", "wb") as f:    # "google.html" is an illustrative filename
    f.write(page)                       # write the bytes to a local file
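urllib.request also provides urlretrieve(), which downloads a URL straight to a local file in one call; a minimal sketch (the filename is illustrative, not from the original article):

import urllib.request

# urlretrieve fetches the URL and writes the response body to the given path
local_path, headers = urllib.request.urlretrieve("http://google.cn/", "google.html")
print(local_path)                        # path of the saved file
print(headers.get("Content-Type"))       # headers returned by the server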
POST method:
import urllib.parse
import urllib.request

url = "http://liuxin-blog.appspot.com/messageboard/add"
values = {"content": "test of sending a web request from the command line"}
# urlencode the form fields; in Python 3 the POST body must be bytes
data = urllib.parse.urlencode(values).encode("utf-8")
# create the request object (passing data makes this a POST request)
req = urllib.request.Request(url, data)
# get the data returned by the server
response = urllib.request.urlopen(req)
# process the data
page = response.read()
GET method:
import urllib.parse
import urllib.request

url = "http://www.google.cn/webhp"
values = {"rls": "ig"}
data = urllib.parse.urlencode(values)  # encode the query parameters
theurl = url + "?" + data              # append the query string to the URL
# create the request object
req = urllib.request.Request(theurl)
# get the data returned by the server
response = urllib.request.urlopen(req)
# process the data
page = response.read()
The response object has two other commonly used methods: geturl() and info(). geturl() returns the URL that was actually retrieved, which lets you check whether the server redirected the request, and info() returns the response headers and related metadata.
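A minimal sketch of both methods, reusing the URL from the earlier examples:

import urllib.request

response = urllib.request.urlopen("http://google.cn/")
print(response.geturl())                      # final URL after any server-side redirects
print(response.info())                        # response headers (an http.client.HTTPMessage)
print(response.info().get("Content-Type"))    # look up a single header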
To handle Chinese text, you will need encode() for encoding and decode() for decoding:
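A minimal sketch of the idea; the charset and filename below are assumptions for illustration, since real pages may declare utf-8, gbk, gb2312, and so on:

import urllib.request

response = urllib.request.urlopen("http://google.cn/")
raw = response.read()                  # bytes exactly as sent by the server

# decode() turns bytes into a str; replace errors rather than crash on a bad charset guess
text = raw.decode("utf-8", errors="replace")

# encode() turns the str back into bytes, e.g. before writing to a file
with open("page.html", "wb") as f:     # "page.html" is an illustrative filename
    f.write(text.encode("utf-8"))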
This covers the common commands for accessing and crawling web pages in Python.