Python gets the status code of the HTTP request (200, 404, etc.)
欧阳克
欧阳克 2017-06-28 09:25:31

How can Python get the status code of an HTTP request (200, 404, etc.) without downloading the entire page source, which would be a waste of resources? For example:

Input: segmentfault.com Output: 200
Input: segmentfault.com/nonexistant Output: 404
欧阳克

Reviewing the old to learn the new makes one fit to be a teacher. Blog: www.ouyangke.com

All replies (2)
ringa_lee

Reference article: List of practical Python scripts

HTTP has not only the GET method (which requests the headers and the body) but also the HEAD method, which requests only the headers.

import http.client  # Python 3 name of the old httplib module

def get_status_code(host, path="/"):
    """ This function retrieves the status code of a website by requesting
        HEAD data from the host. This means that it only requests the headers.
        If the host cannot be reached or something else goes wrong, it returns
        None instead.
    """
    try:
        conn = http.client.HTTPConnection(host)
        conn.request("HEAD", path)
        return conn.getresponse().status
    except Exception:  # StandardError no longer exists in Python 3
        return None

print(get_status_code("segmentfault.com")) # prints 200
print(get_status_code("segmentfault.com", "/nonexistant")) # prints 404
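
To see the difference between GET and HEAD concretely, here is a small self-contained sketch that needs no network access. It starts a throwaway local server and counts how many body bytes each method delivers. The names `DemoHandler`, `fetch`, and `run_demo`, and the 10,013-byte page, are made up for illustration and are not part of the answer above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: status 200 for "/", 404 for everything else."""

    BODY = b"<html>" + b"x" * 10000 + b"</html>"  # 10013 bytes

    def _send_headers(self):
        status = 200 if self.path == "/" else 404
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()

    def do_HEAD(self):
        # HEAD: headers only, no body is written.
        self._send_headers()

    def do_GET(self):
        # GET: the same headers, followed by the body.
        self._send_headers()
        self.wfile.write(self.BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, method, path):
    """Return (status, number of body bytes received) for one request."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        return resp.status, len(resp.read())
    finally:
        conn.close()


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return {
            "GET /": fetch(port, "GET", "/"),
            "HEAD /": fetch(port, "HEAD", "/"),
            "HEAD /missing": fetch(port, "HEAD", "/missing"),
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for request, (status, size) in run_demo().items():
        print(request, "->", status, "with", size, "body bytes")
```

Both HEAD requests still report the correct status code, but only the GET transfers the page body, which is exactly the saving the question asks for.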
刘奇

A GET request fetches both the headers and the body. Try the HEAD method instead, which requests only the headers!

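import requests
response = requests.head('http://segmentfault.com')    # request only the resource's headers with HEAD
print(response.status_code)  # status code

# requests needs a full URL with scheme and host, not just a path
response = requests.head('http://segmentfault.com/nonexistant')   # request only the resource's headers with HEAD
print(response.status_code)   # status code

# Output:
200
404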
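
If installing requests is not an option, a similar URL-based helper can be sketched with only the standard library. This is a minimal sketch, not a drop-in replacement: the name `head_status` and the 5-second default timeout are assumptions for illustration. It accepts full `http://` or `https://` URLs and, because `http.client` does no redirect handling, it reports a 301 or 302 as-is rather than following it.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=5.0):
    """Return the status code of a HEAD request to `url`, or None on failure.

    Hypothetical helper: takes a full URL and does not follow redirects.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    elif parts.scheme == "http":
        conn_class = http.client.HTTPConnection
    else:
        return None  # no scheme or an unsupported one, e.g. "/nonexistant"
    if not parts.hostname:
        return None  # a URL such as "http:///path" has no host to contact

    # Rebuild the request target: path plus optional query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    try:
        # parts.port is None when the URL has no explicit port,
        # in which case the connection class uses its default (80 or 443).
        conn = conn_class(parts.hostname, parts.port, timeout=timeout)
        try:
            conn.request("HEAD", target)
            return conn.getresponse().status
        finally:
            conn.close()
    except (OSError, http.client.HTTPException, ValueError):
        # Unreachable host, timeout, malformed response, or invalid port.
        return None
```

Calling `head_status("http://segmentfault.com/nonexistant")` sends a single HEAD request. A bare path such as `head_status("/nonexistant")` returns `None`, because there is no host to connect to.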