A detailed walkthrough of Python code for recognizing verification codes

Y2J
Release: 2017-05-08 16:06:14

This article explains how to recognize verification codes (CAPTCHAs) in Python. It is an introductory tutorial: the method is described step by step, and complete sample code is given at the end for anyone who needs it.

Preface

Verification codes? Can I crack them too?

I won't spend long introducing verification codes; they turn up all the time in everyday life. For a student, the one encountered most often is the verification code on the Academic Affairs Office system.

Identification method

A simulated login involves many steps. Here we ignore everything else and focus on one task: given a verification code picture, return the string it shows.

Verification codes are deliberately made colorful to create interference, so the first job is to remove that interference. This step takes repeated experimentation; enhancing the color and increasing the contrast both help.
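
The two clean-up steps mentioned here, a contrast boost followed by discarding everything that is not pure black, can be sketched roughly as follows. This is an illustration for Python 3, not the article's exact code: so that it runs without an image library, it assumes the picture is a nested list of RGB tuples. The helper names `boost_contrast` and `keep_only_black` are made up for the example, and the contrast formula is only a simplified stand-in for what PIL's `ImageEnhance.Contrast` does.

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def boost_contrast(pixels, factor):
    """Push every channel away from the image's mean brightness, clamped to 0..255."""
    channels = [c for row in pixels for px in row for c in px]
    mean = sum(channels) / len(channels)

    def stretch(c):
        return max(0, min(255, int(round(mean + factor * (c - mean)))))

    return [[tuple(stretch(c) for c in px) for px in row] for row in pixels]


def keep_only_black(pixels):
    """Keep pure-black pixels; turn every other pixel white."""
    return [[BLACK if px == BLACK else WHITE for px in row] for row in pixels]


# Hypothetical 1x3 image: a dark stroke, reddish noise, a light background.
image = [[(20, 20, 20), (200, 60, 60), (240, 240, 240)]]
cleaned = keep_only_black(boost_contrast(image, 7))
print(cleaned)
```

The factor 7 matches the value passed to `enhancer.enhance` in the sample code; for a different verification code the right value would have to be found by experiment.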

After trying various operations on the pictures, I settled on a scheme that removes the interference almost perfectly. In the best case, what remains is a clean black-and-white picture of the characters. Each picture holds four characters, and they cannot all be recognized in one go, so the picture is cropped into four small pictures with one character each, and each small picture is recognized separately.
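
A rough sketch of the cropping step, assuming every character occupies a fixed 13x20 window. The left offsets 6, 19, 32 and 45 are the ones used in the sample code at the end of this article; the nested-list image and the name `split_characters` are only for illustration (Python 3).

```python
def split_characters(pixels, offsets=(6, 19, 32, 45), width=13, height=20):
    """Cut one width x height tile per offset out of a row-major pixel grid."""
    return [[row[x:x + width] for row in pixels[:height]] for x in offsets]


# Hypothetical 60x20 image whose "pixel" value is simply its column index,
# which makes it easy to see which columns ended up in which tile.
image = [[col for col in range(60)] for _ in range(20)]
tiles = split_characters(image)
print(len(tiles), len(tiles[0]), len(tiles[0][0]))
```

Fixed offsets only work because this particular verification code always draws its characters in the same places; a code with drifting characters would need a real segmentation step.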

The next step is to recognize the characters. First, convert each small picture into a matrix of 0s and 1s, so that each matrix represents one character.


For example, here is the matrix for the digit 6 (13 columns by 20 rows):

num_6=[
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,0,0,0,0,0,0,
0,0,0,0,1,1,1,0,0,0,0,0,0,
0,0,0,1,1,1,0,0,0,0,0,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0,
0,0,1,1,0,0,0,0,0,0,0,0,0,
0,0,1,1,0,0,0,0,0,0,0,0,0,
0,1,1,1,1,1,1,1,0,0,0,0,0,
0,1,1,1,1,1,1,1,1,0,0,0,0,
0,1,1,0,0,0,0,1,1,1,0,0,0,
0,1,1,0,0,0,0,0,1,1,0,0,0,
0,1,1,0,0,0,0,0,1,1,0,0,0,
0,1,1,1,0,0,0,1,1,1,0,0,0,
0,0,1,1,1,1,1,1,1,0,0,0,0,
0,0,0,1,1,1,1,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
]

Squint at it from a distance and you can still make out the digit.
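
The conversion from a character tile to such a 0/1 list, and a quick way to eyeball the result without squinting, might look like the sketch below. It assumes the tile is a nested list of RGB tuples that has already been reduced to black and white; `to_bits` and `render` are illustrative names, not part of the article's code (Python 3).

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def to_bits(tile):
    """Flatten a tile row by row: 1 for a black pixel, 0 for anything else."""
    return [1 if px == BLACK else 0 for row in tile for px in row]


def render(bits, width=13):
    """Draw a flat 0/1 list as text, one line per row of `width` values."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    return "\n".join("".join("#" if b else "." for b in row) for row in rows)


# Hypothetical 2x3 tile with a short diagonal stroke.
tile = [[BLACK, WHITE, WHITE],
        [WHITE, BLACK, WHITE]]
bits = to_bits(tile)
print(render(bits, width=3))
```

Calling `render(num_6)` on the matrix above would draw the digit with `#` characters, which is easier to check by eye than rows of 0s and 1s.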


Because this verification code is very regular and each character sits at a fixed position, no machine learning algorithm is needed. A simple matrix comparison is enough: compare the character's matrix with every stored template matrix and pick the template with the highest similarity. Many comparison methods would work; for data this simple, anything that identifies the characters correctly will do.
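
One simple comparison, and the one the sample code in this article uses, is to count the positions where the tile and a template agree and keep the best-scoring template. A minimal sketch, assuming the templates live in a dict that maps each character to its flat 0/1 list; the names `similarity` and `recognize` and the tiny 2x3 templates are made up for the example (Python 3).

```python
def similarity(bits, template):
    """Number of positions at which the two 0/1 lists hold the same value."""
    return sum(1 for a, b in zip(bits, template) if a == b)


def recognize(bits, templates):
    """Return the character whose template agrees with `bits` in the most positions."""
    return max(templates, key=lambda ch: similarity(bits, templates[ch]))


# Hypothetical 2x3 templates; the real ones would hold 260 values each.
templates = {
    "1": [0, 1, 0,
          0, 1, 0],
    "7": [1, 1, 1,
          0, 0, 1],
}
noisy_one = [0, 1, 0,
             0, 1, 1]  # a "1" with one stray black pixel
print(recognize(noisy_one, templates))
```

Because the score counts agreeing pixels rather than demanding an exact match, a few stray pixels left over from the clean-up step do not change the answer.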

At this point the verification code has been recognized.

The recognition relies mainly on Python's PIL for image manipulation. The complete code for simulating the login and filling in the verification code automatically follows.

Sample code

# -*- coding: utf-8 -*-
# Python 2 script: reload(sys) and sys.setdefaultencoding exist only in Python 2.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
import re
import requests
import io
import os
import json
from PIL import Image
from PIL import ImageEnhance
from bs4 import BeautifulSoup

import mdata

class Student:
 def __init__(self, user, password):
  self.user = str(user)
  self.password = str(password)
  self.s = requests.Session()

 def login(self):
  url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
  res = self.s.get(url).text
  imageUrl = 'http://202.118.31.197/'+re.findall('<img src="(.+?)" width="55"',res)[0]
  im = Image.open(io.BytesIO(self.s.get(imageUrl).content))
  # boost the contrast, then turn every pixel that is not pure black white
  enhancer = ImageEnhance.Contrast(im)
  im = enhancer.enhance(7)
  x,y = im.size
  for i in range(y):
   for j in range(x):
    if (im.getpixel((j,i))!=(0,0,0)):
     im.putpixel((j,i),(255,255,255))
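  num = [6,19,32,45]  # left edge of each of the four characters
  verifyCode = ""
  for n in range(4):
   # crop one 13x20 character tile
   a = im.crop((num[n],0,num[n]+13,20))
   # flatten the tile row by row into 260 values: 1 = black, 0 = white
   l=[]
   x,y = a.size
   for i in range(y):
    for j in range(x):
     if (a.getpixel((j,i))==(0,0,0)):
      l.append(1)
     else:
      l.append(0)
   # pick the template in mdata.data that agrees with the tile in the most positions
   his=0
   chrr=""
   for ch in mdata.data:
    r=0
    for j in range(260):
     if(l[j]==mdata.data[ch][j]):
      r+=1
    if(r>his):
     his=r
     chrr=ch
   verifyCode+=chrr
   # print "Verification code auto-filled:",verifyCode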
  data= {
  'WebUserNO':str(self.user),
  'Password':str(self.password),
  'Agnomen':verifyCode,
  }
  url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
  t = self.s.post(url,data=data).text
  if re.findall("images/Logout2",t)==[]:
   l = '[0,"'+re.findall('alert((.+?));',t)[1][1][2:-2]+'"]'+" "+self.user+" "+self.password+"\n"
   # print l
   # return '[0,"'+re.findall('alert((.+?));',t)[1][1][2:-2]+'"]'
   return [False,l]
  else:
   l = 'Login successful '+re.findall('! (.+?) ',t)[0]+" "+self.user+" "+self.password+"\n"
   # print l
   return [True,l]

 def getInfo(self):
  imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
  data = self.s.get('http://202.118.31.197/ACTIONQUERYBASESTUDENTINFO.APPPROCESS?mode=3').text # student record information
  data = BeautifulSoup(data,"lxml")
  q = data.find_all("table",attrs={'align':"left"})
  a = []
  for i in q[0]:
   if type(i)==type(q[0]) :
    for j in i :
     if type(j) ==type(i):
      a.append(j.text)
  for i in q[1]:
   if type(i)==type(q[1]) :
    for j in i :
     if type(j) ==type(i):
      a.append(j.text)
  data = {}
  for i in range(1,len(a),2):
   data[a[i-1]]=a[i]
  # data['photo'] = io.BytesIO(self.s.get(imageUrl).content)
  return json.dumps(data)

 def getPic(self):
  imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
  pic = Image.open(io.BytesIO(self.s.get(imageUrl).content))
  return pic

 def getScore(self):
   score = self.s.get('http://202.118.31.197/ACTIONQUERYSTUDENTSCORE.APPPROCESS').text # transcript
   score = BeautifulSoup(score, "lxml")
   q = score.find_all(attrs={'height':"36"})[0]
   point = q.text
   # '平均学分绩点' is the page's own label for the average GPA, so it stays untranslated
   print point[point.find('平均学分绩点'):]
   table = score.html.body.table
   people = table.find_all(attrs={'height' : '36'})[0].string
   r = table.find_all('table',attrs={'align' : 'left'})[0].find_all('tr')
   subject = []
   lesson = []
   for i in r[0]:
    if type(r[0])==type(i):
     subject.append(i.string)
   for i in r:
    k=0
    temp = {}
    for j in i:
     if type(r[0])==type(j):
      temp[subject[k]] = j.string
      k+=1
    lesson.append(temp)
   lesson.pop()
   lesson.pop(0)
   return json.dumps(lesson)

 def logoff(self):
  return self.s.get('http://202.118.31.197/ACTIONLOGOUT.APPPROCESS').text

if __name__ == "__main__":
 a = Student(20150000,20150000)
 r = a.login()
 print r[1]
 if r[0]:
  r = json.loads(a.getScore())
  for i in r:
   for j in i:
    print i[j],
   print
  q = json.loads(a.getInfo())
  for i in q:
   print i,q[i]
  a.getPic().show()
 a.logoff()
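
The `mdata` module imported by the sample is not shown in the article. From the way it is used, `mdata.data` appears to be a dict that maps each character to a 260-element 0/1 list like `num_6`. The following is only a guess at how such a file could be generated from hand-labelled tiles; the function name `make_mdata_source` and the output layout are assumptions, and the sketch targets Python 3.

```python
import pprint


def make_mdata_source(labelled_bits, size=260):
    """Return Python source text that defines `data` as {character: 0/1 list}."""
    data = {}
    for ch, bits in labelled_bits:
        if len(bits) != size:
            raise ValueError("template for %r has %d values, expected %d"
                             % (ch, len(bits), size))
        data.setdefault(ch, list(bits))  # keep the first sample of each character
    return "data = " + pprint.pformat(data, width=100, compact=True) + "\n"


# Hypothetical labelled samples: an all-white tile and an all-black tile.
samples = [("0", [0] * 260), ("1", [1] * 260)]
source = make_mdata_source(samples)

# Check that the generated text really is loadable Python.
namespace = {}
exec(source, namespace)
print(sorted(namespace["data"]))
```

Writing `source` to a file named `mdata.py` next to the script would then let `import mdata` find it.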

source:php.cn