Implementing a Proxy Service in Python: An Example

WBOY

Jun 16, 2016 08:46 AM


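The principle behind a proxy service is simple. Take a browser and a web server: browser A
sends a request to proxy B, B forwards the request to web server C, and C's response travels back along C -> B -> A.
To write a web proxy you first need a working knowledge of HTTP, though not a deep one, unless you want
advanced features such as rewriting parts of a message or load balancing. An HTTP request has three parts: the request line, the headers, and the body.
Detailed descriptions are easy to find online. Below is a normal GET request header (the Cookie part is left out of the screenshot; the system is Windows 7):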

[Image: screenshot of a normal GET request header]

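The first line consists of the request method (GET), the path (/), and the protocol version. The lines after it are request headers, all in key-value form.
A GET request has no body; a POST request does. Apart from that, the header layout is essentially the same for every method, and each line ends with \r\n.
The basic request methods are: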

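GET      Retrieve the resource identified by the Request-URI
POST     Submit new data to the resource identified by the Request-URI
HEAD     Retrieve only the response headers for the resource identified by the Request-URI
PUT      Ask the server to store a resource and use the Request-URI as its identifier
DELETE   Ask the server to delete the resource identified by the Request-URI
TRACE    Ask the server to echo back the request it received, mainly for testing or diagnostics
CONNECT  Ask a proxy to open a tunnel to the target host (this is how HTTPS goes through a proxy)
OPTIONS  Query the server's capabilities, or the options and requirements associated with a resource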
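
The request layout described above (request line, key-value headers, optional body, every line ending in \r\n) can be sketched as a small parser. This is only an illustration: `parse_request` and the sample request are made up for this example and are not part of the article's proxy.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    # A blank line (\r\n\r\n) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    # Request line: method, resource path (or full URL), protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Hypothetical sample request, used only for illustration.
raw = (b"POST /login HTTP/1.1\r\n"
       b"Host: www.a.com\r\n"
       b"Content-Length: 9\r\n"
       b"\r\n"
       b"user=demo")
method, target, version, headers, body = parse_request(raw)
print(method, target, version)   # POST /login HTTP/1.1
print(headers["host"], body)     # www.a.com b'user=demo'
```

Header names are lowercased here because HTTP header names are case-insensitive; a real proxy would also need to cope with repeated headers, which this sketch overwrites.
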
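
Once a proxy is in use, however, the request that arrives at the proxy looks like this:
[Image: screenshot of the same GET request as received by the proxy]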

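Compare this with the first screenshot. What is different? The resource path on the first line. When a proxy is configured in the browser, the browser sends the whole URL as the resource path, so the proxy has to strip the scheme and domain name and then send the modified request to the target
web server. That is all there is to it. The CONNECT method is a special case and needs separate handling, so the other methods are covered first.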
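
As a rough sketch of that rewrite, the helper below turns a proxy-style request line into the form a web server expects and also returns the host part. The name `to_origin_form` is hypothetical, and it uses `urllib.parse.urlsplit` rather than the string splitting used in the article's own code.

```python
from urllib.parse import urlsplit


def to_origin_form(request_line):
    """Return (rewritten_request_line, host_part) for a request line without its \\r\\n.

    host_part is None when the line already carries a plain path.
    """
    method, target, version = request_line.split(" ", 2)
    parts = urlsplit(target)
    if not parts.netloc:
        return request_line, None        # already a plain path, nothing to strip
    path = parts.path or "/"             # "http://www.a.com" means "/"
    if parts.query:
        path += "?" + parts.query
    return "{} {} {}".format(method, path, version), parts.netloc


print(to_origin_form("GET http://www.a.com/ HTTP/1.1"))
# ('GET / HTTP/1.1', 'www.a.com')
```

Only the first line is touched, so a URL that happens to appear in a later header such as Referer is left alone.
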
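The basic approach:
1. The proxy server listens for connections. When a client browser connects, accept() returns the client socket (a handle, or descriptor).
2. Use the client socket to receive the browser's request and split off the first line, both to rewrite it and to obtain the method.
The part to be removed, minus its leading http://, is the target host; call it targetHost.
3. Step 2 yields the method, the request, and targetHost. At this point each method can be handled differently.
GET, POST, PUT, DELETE and the rest are handled in essentially the same way, CONNECT being the exception, so the first line is rewritten, for example: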

GET http://www.a.com/ HTTP/1.1
becomes
GET / HTTP/1.1

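Here targetHost is the host part of the URL, www.a.com (highlighted in red in the original post), and the request goes to the default port, so port is 80. If targetHost includes a port (for example www.a.com:8081),
the port has to be split off, and port is 8081. With targetHost and port in hand, the proxy can connect to the target server. The code is as follows: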
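def getTargetInfo(self, host):  # split host[:port] into site and port; return both
    port = 0
    site = None
    if ':' in host:
        tmp = host.split(':')
        site = tmp[0]
        port = int(tmp[1])
    else:
        site = host
        port = 80  # default HTTP port
    return site, port

def commonMethod(self, request):  # handles every method except CONNECT
    # self.targetHost holds the full URL taken from the request line
    tmp = self.targetHost.split('/')  # ['http:', '', 'www.a.com', ...]
    net = tmp[0] + '//' + tmp[2]      # e.g. 'http://www.a.com'
    request = request.replace(net, '', 1)  # strip it from the first line only
    targetAddr = self.getTargetInfo(tmp[2])  # call the function above
    try:
        (fam, _, _, _, addr) = socket.getaddrinfo(targetAddr[0], targetAddr[1])[0]
    except Exception as e:
        print(e)
        return
    self.target = socket.socket(fam)
    self.target.connect(addr)  # connect to the target web server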
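
For comparison, here is a minimal sketch of the same two jobs (splitting host and port, then connecting) using standard-library helpers. `split_host_port` and `connect_to_target` are hypothetical names, and using `urlsplit` plus `socket.create_connection` is a substitution for the article's manual split and getaddrinfo() call, not the author's method.

```python
import socket
from urllib.parse import urlsplit


def split_host_port(netloc, default_port=80):
    """Split 'host' or 'host:port' (including '[ipv6]:port') into (host, port)."""
    parts = urlsplit("//" + netloc)   # the leading // makes urlsplit treat it as a netloc
    return parts.hostname, parts.port or default_port


def connect_to_target(netloc, timeout=10):
    """Resolve the host and return a connected TCP socket."""
    host, port = split_host_port(netloc)
    # create_connection() runs getaddrinfo() itself and tries each address in turn.
    return socket.create_connection((host, port), timeout=timeout)


print(split_host_port("www.a.com"))        # ('www.a.com', 80)
print(split_host_port("www.a.com:8081"))   # ('www.a.com', 8081)
```

One practical difference: a plain `split(':')` breaks on IPv6 literals such as `[::1]:8080`, which `urlsplit` handles.
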

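4. This step is easy: the request produced in step 3 can be sent to the web server with self.target.send(request).
5. The web server's response is simply forwarded to the client through the proxy. I used select for non-blocking I/O multiplexing; epoll is worth trying as well.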
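
The forwarding in step 5 can be sketched as a standalone relay loop built on select. The function name `relay` and its parameters are assumptions made for this illustration. Unlike the article's loop, this version also stops when both sides have been idle for `idle_timeout` seconds.

```python
import select
import socket


def relay(client, target, bufsize=4096, idle_timeout=3):
    """Copy bytes in both directions until one side closes, errors, or goes idle."""
    peers = {client: target, target: client}   # where to forward data from each socket
    try:
        while True:
            readable, _, errored = select.select(list(peers), [], list(peers), idle_timeout)
            if errored or not readable:
                return                          # socket error or idle timeout
            for sock in readable:
                data = sock.recv(bufsize)
                if not data:
                    return                      # that side closed the connection
                peers[sock].sendall(data)       # forward to the other side
    finally:
        client.close()
        target.close()
```

`sendall` is used instead of `send` because `send` may transmit only part of the buffer.
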
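Those are the basic steps. The functions used can be improved, for example by choosing between multithreading and multiprocessing in the main function,
but the overall approach stays much the same. To test the proxy, install the SwitchySharp extension in Chrome and set the proxy port to 8083;
for Firefox, use the AutoProxy add-on.
CONNECT handling is still being worked out (help from fellow bloggers would be welcome), so for now this proxy does not support HTTPS.
A proxy sees everything in the HTTP exchange, so running one is a good way to learn HTTP.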
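
One common way to handle CONNECT, offered here only as a sketch and not as the author's solution, is to open a TCP connection to the requested host, answer the client with a 200 response, and then relay raw bytes in both directions without inspecting them. The helper names `parse_connect_target` and `open_tunnel` are hypothetical.

```python
import socket


def parse_connect_target(request_line, default_port=443):
    """Return (host, port) from a line such as 'CONNECT www.a.com:443 HTTP/1.1'."""
    method, authority, _version = request_line.split(" ", 2)
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, sep, port = authority.rpartition(":")
    if not sep:
        return authority, default_port      # no explicit port given
    return host.strip("[]"), int(port)      # strip brackets from IPv6 literals


def open_tunnel(client, request_line, timeout=10):
    """Connect to the requested host and tell the client the tunnel is ready.

    Returns the connected target socket, or None if the connection failed.
    """
    host, port = parse_connect_target(request_line)
    try:
        target = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        client.sendall(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
        client.close()
        return None
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    return target
```

In the Proxy class from the article, connectMethod() could call a helper like this, store the result in self.target, and then call self.nonblocking() to shuttle the encrypted bytes back and forth. The proxy cannot read that traffic; it only passes it through.
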
The full code follows:

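# -*- coding: UTF-8 -*-
# Requires Python 3
import select
import socket
import threading
from multiprocessing import Process  # only needed for the multiprocessing variant below


class Proxy:
    def __init__(self, soc):
        # accept() blocks until a client (browser) connects
        self.client, _ = soc.accept()
        self.target = None
        self.request_url = None
        self.BUFSIZE = 4096
        self.method = None
        self.targetHost = None

    def getClientRequest(self):
        # latin-1 maps every byte to one character, so the request can be
        # edited as text and encoded back without loss
        request = self.client.recv(self.BUFSIZE).decode('latin-1')
        if not request:
            return None
        cn = request.find('\n')
        firstLine = request[:cn]
        print(firstLine.strip())  # log the request line
        line = firstLine.split()
        if len(line) < 2:
            return None  # malformed request line
        self.method = line[0]
        self.targetHost = line[1]  # the full URL from the request line
        return request

    def commonMethod(self, request):
        tmp = self.targetHost.split('/')  # ['http:', '', 'www.a.com', ...]
        net = tmp[0] + '//' + tmp[2]      # e.g. 'http://www.a.com'
        request = request.replace(net, '', 1)  # strip it from the first line only
        targetAddr = self.getTargetInfo(tmp[2])
        try:
            (fam, _, _, _, addr) = socket.getaddrinfo(targetAddr[0], targetAddr[1])[0]
        except Exception as e:
            print(e)
            self.client.close()
            return
        self.target = socket.socket(fam)
        self.target.connect(addr)  # connect to the target web server
        self.target.sendall(request.encode('latin-1'))
        self.nonblocking()

    def connectMethod(self, request):  # CONNECT handling can be added here
        self.client.close()  # not supported yet, so do not leave the browser waiting

    def run(self):
        request = self.getClientRequest()
        if request:
            # the original list contained 'HAVE', a typo for 'HEAD'
            if self.method in ['GET', 'POST', 'PUT', 'DELETE', 'HEAD']:
                self.commonMethod(request)
            elif self.method == 'CONNECT':
                self.connectMethod(request)
            else:
                self.client.close()
        else:
            self.client.close()

    def nonblocking(self):
        # forward data in both directions until one side closes
        inputs = [self.client, self.target]
        done = False
        try:
            while not done:
                readable, writeable, errs = select.select(inputs, [], inputs, 3)
                if errs:
                    break
                for soc in readable:
                    data = soc.recv(self.BUFSIZE)
                    if not data:
                        done = True  # leave the while loop too, not just the for loop
                        break
                    if soc is self.client:
                        self.target.sendall(data)
                    elif soc is self.target:
                        self.client.sendall(data)
        finally:
            self.client.close()
            self.target.close()

    def getTargetInfo(self, host):
        # split host[:port] into site and port
        port = 0
        site = None
        if ':' in host:
            tmp = host.split(':')
            site = tmp[0]
            port = int(tmp[1])
        else:
            site = host
            port = 80  # default HTTP port
        return site, port


if __name__ == '__main__':
    host = '127.0.0.1'
    port = 8083
    backlog = 5
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(backlog)
    while True:
        # Proxy(server) blocks in accept(); each connection then runs in its own thread
        threading.Thread(target=Proxy(server).run, daemon=True).start()
        # p = Process(target=Proxy(server).run, args=())  # multiprocessing variant
        # p.start()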
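
The proxy can also be exercised from Python instead of a browser extension. The sketch below builds a urllib opener that sends plain HTTP requests through 127.0.0.1:8083, the address used in the code above. The function names `build_proxy_opener` and `fetch_via_proxy` are hypothetical, and the proxy must already be running for a fetch to succeed.

```python
import urllib.request


def build_proxy_opener(host="127.0.0.1", port=8083):
    """Return an opener that routes http:// requests through the given proxy."""
    proxy_url = "http://{}:{}".format(host, port)
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, timeout=10):
    """Fetch url through the local proxy; return (status code, body bytes)."""
    opener = build_proxy_opener()
    with opener.open(url, timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling fetch_via_proxy() with any plain http:// URL while the proxy is running should make the proxy print the request line it received. https:// URLs bypass this opener's proxy setting, since only the "http" scheme is mapped.
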
