An example of implementing Baidu speech recognition in Python

小云云
Release: 2017-12-14 11:29:17

Artificial intelligence is taking an increasingly important position in the market, and Python is one of the most widely used languages for working with it. Let's see it in action.

The sample programs provided by Baidu, in both the C and Java versions, come in two variants: method1 and method2. The former is called implicit (the POST body is a JSON string, with the audio data encoded inside the JSON), and the latter is called explicit (the POST body is the audio data itself). This article walks through a usage example of the Baidu speech recognition API implemented in Python. It may serve as a reference for anyone who needs it.

At first, I assumed that Python's wave package handled audio as "strings", and I worried that this would not match the arrays used in the C version, so I chose the less efficient but safer method1: base64-encode the audio data, gather it into a dict together with the sampling rate, number of channels, and other information, and finally encode the dict as a JSON string.
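The method1 body described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not Baidu's sample code: the field names (format, rate, channel, cuid, token, len, speech) are assumptions about the JSON format and should be checked against Baidu's current documentation, and build_implicit_payload and the placeholder values are hypothetical.

```python
import base64
import json


def build_implicit_payload(pcm_bytes, rate, token, cuid, channels=1):
    """Build the JSON body for the 'implicit' (method1) request.

    The field names are assumptions, not taken from the article.
    """
    payload = {
        "format": "pcm",
        "rate": rate,
        "channel": channels,
        "cuid": cuid,
        "token": token,
        # Assumed to be the raw audio length in bytes, before base64 encoding.
        "len": len(pcm_bytes),
        # b64encode returns bytes in Python 3; JSON needs text.
        "speech": base64.b64encode(pcm_bytes).decode("ascii"),
    }
    return json.dumps(payload)


# Placeholder input: 0.1 s of 16-bit mono silence at 8 kHz.
body = build_implicit_payload(b"\x00\x00" * 800, 8000, "TOKEN", "CUID")
```

The sketch only builds the body. It does not send anything, and it does not establish why the server rejected the request.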

With method1, the request always failed with:

3300 The input parameters are incorrect

I tried the urllib2 and pycurl packages in turn, and both produced the same error.

So I switched to method2, which succeeded (it seems the wave package does not store audio as a "string" after all).
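What the wave package actually returns can be checked directly. The following is a minimal sketch, assuming Python 3 (where readframes returns a bytes object; in Python 2 it returns a str, which is also a byte string). The helper names make_test_wav and read_pcm are hypothetical, and the WAV data is generated in memory rather than read from vad_0.wav.

```python
import io
import wave


def make_test_wav(n_frames=800, rate=8000):
    """Return an in-memory 16-bit mono WAV file filled with silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)
    return buf


def read_pcm(source):
    """Read all frames and return (raw_bytes, frame_rate)."""
    with wave.open(source, "rb") as w:
        frames = w.readframes(w.getnframes())
        # Byte count = frames * bytes per sample * channels.
        expected = w.getnframes() * w.getsampwidth() * w.getnchannels()
        assert len(frames) == expected
        return frames, w.getframerate()


pcm, rate = read_pcm(make_test_wav())
print(type(pcm).__name__, len(pcm), rate)  # prints: bytes 1600 8000
```

The raw bytes can be posted as they are, which is what method2 does. The complete method2 program follows.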

#encoding=utf-8
# Python 2 script: uses urllib2 and the print statement
import wave
import urllib2, pycurl
import json

## get access token by api key & secret key
def get_token():
  apiKey = "xxxxxxxx"
  secretKey = "xxxxxxxxx"
  auth_url = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=" + apiKey + "&client_secret=" + secretKey
  res = urllib2.urlopen(auth_url)
  json_data = res.read()
  return json.loads(json_data)['access_token']

## pycurl write callback: print each chunk of the response body
def dump_res(buf):
  print buf

## post audio to server
def use_cloud(token):
  fp = wave.open('vad_0.wav', 'rb')
  nf = fp.getnframes()
  rate = fp.getframerate() #sample rate read from the WAV header
  audio_data = fp.readframes(nf)
  fp.close()
  f_len = len(audio_data) #byte count; equals nf * 2 only for 16-bit mono audio

  cuid = "xxxxxxxxxx" #my xiaomi phone MAC
  srv_url = 'http://vop.baidu.com/server_api' + '?cuid=' + cuid + '&token=' + token
  http_header = [
    'Content-Type: audio/pcm; rate=%d' % rate,
    'Content-Length: %d' % f_len
  ]
  c = pycurl.Curl()
  c.setopt(pycurl.URL, str(srv_url)) #curl doesn't support unicode
  c.setopt(c.HTTPHEADER, http_header)  #must be list, not dict
  c.setopt(c.POST, 1)
  c.setopt(c.CONNECTTIMEOUT, 30)
  c.setopt(c.TIMEOUT, 30)
  c.setopt(c.WRITEFUNCTION, dump_res)
  c.setopt(c.POSTFIELDS, audio_data)
  c.setopt(c.POSTFIELDSIZE, f_len)
  c.perform() #pycurl.perform() has no return val
  c.close()

if __name__ == "__main__":
  token = get_token()
  use_cloud(token)
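For readers on Python 3, where urllib2 no longer exists, the same explicit (method2) request can be expressed with the standard library alone. This is a hedged sketch rather than a tested replacement: it reuses the endpoint and headers from the listing above, the helper name build_explicit_request and the placeholder token and cuid are hypothetical, and the service itself may have changed since the article was written. The request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_explicit_request(pcm_bytes, rate, token, cuid):
    """Build (but do not send) the 'explicit' (method2) POST request."""
    query = urllib.parse.urlencode({"cuid": cuid, "token": token})
    url = "http://vop.baidu.com/server_api?" + query
    headers = {
        # The sample rate travels in the Content-Type header.
        "Content-Type": "audio/pcm; rate=%d" % rate,
        "Content-Length": str(len(pcm_bytes)),
    }
    # The body is the raw PCM data, with no JSON or base64 wrapping.
    return urllib.request.Request(
        url, data=pcm_bytes, headers=headers, method="POST"
    )


# Placeholder input: 0.1 s of 16-bit mono silence at 8 kHz.
req = build_explicit_request(b"\x00\x00" * 800, 8000, "TOKEN", "CUID")
# Sending would look like: urllib.request.urlopen(req, timeout=30).read()
```

Building the query string with urlencode also escapes any special characters in the token or cuid, which plain string concatenation does not.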

Run results

{"corpus_no":"6150045491002357923","err_msg":"success.","err_no":0,"result":["播放小苹果,"],"sn":"243903724071431919050"}
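The response is a JSON object in which err_no is 0 on success and result holds the recognized text (here 播放小苹果, "play Little Apple"). A minimal Python 3 sketch for handling it might look like the following. The names parse_result and RecognitionError are hypothetical, and treating every non-zero err_no as a failure is an assumption based on the two responses shown in this article.

```python
import json


class RecognitionError(Exception):
    """Raised when the server reports a non-zero err_no."""


def parse_result(raw):
    """Return the list of recognized strings from a JSON response."""
    data = json.loads(raw)
    if data.get("err_no") != 0:
        raise RecognitionError(
            "%s: %s" % (data.get("err_no"), data.get("err_msg"))
        )
    return data.get("result", [])


# The response shown above, used as sample input.
sample = (
    '{"corpus_no":"6150045491002357923","err_msg":"success.",'
    '"err_no":0,"result":["播放小苹果,"],"sn":"243903724071431919050"}'
)
texts = parse_result(sample)
```

In the pycurl listing this function could be called from the write callback instead of printing the buffer, provided the whole response arrives in a single chunk.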

source:php.cn