I previously wrote a utility WeChat mini program (Find peripheral) that uses speech recognition. This article shares my experience of implementing speech recognition in a mini program, in the hope that it helps others.
Interface Preview
After reading the iFlytek interface documentation and the mini program development documentation, and learning the back-end ThinkPHP framework, I compiled the following development steps:
1. Register an iFlytek account (the pride of Chinese people, the world's leading speech recognition technology).
2. Go to the AIUI open platform, create an application under application management, and record the APPID and ApiKey.
3. Open the application configuration and set the scene mode, recognition method and skills that suit your needs.
4. Develop the mini program so that it records the audio to be recognized (detailed below).
5. On the back end, transcode the recorded audio (iFlytek supports pcm and wav) and submit it to the recognition interface (detailed below).
6. The mini program receives the recognition result and carries on with the next piece of business logic.
Audio recording interface
wx.startRecord() and wx.stopRecord()
The wx.startRecord() and wx.stopRecord() interfaces can also meet this need, but the WeChat team has stopped maintaining them since base library version 1.6.0. It is recommended to use the more capable wx.getRecorderManager interface instead. The recorded audio I obtained was in silk format.
The file I received was webm data encoded as base64 (this is what the WeChat developer tools simulator produces), so after decoding it we need to convert the webm audio into pcm or wav.
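To make the decoding step concrete, here is a minimal Node.js sketch of the check the back end performs: if the upload looks like a base64 data URI, decode the part after the first comma; otherwise treat it as raw audio. The helper name `decodeUpload` and the `data:audio/webm;base64,` prefix used in the usage note are assumptions for illustration, not something the original back end defines.

```javascript
// Hypothetical helper (not part of the original back end).
// Takes the uploaded file as a Buffer and returns the raw audio bytes,
// decoding them first when the upload is a base64 data URI.
function decodeUpload(buf) {
  // Only the start of the file needs inspecting; latin1 maps one byte to
  // one character, so string indexes equal byte offsets.
  const head = buf.subarray(0, 256).toString('latin1');
  const comma = head.indexOf(',');
  if (comma !== -1 && head.slice(0, comma).includes('base64')) {
    // Everything after the first comma is the base64 payload.
    const payload = buf.subarray(comma + 1).toString('latin1');
    return { wasBase64: true, bytes: Buffer.from(payload, 'base64') };
  }
  // No base64 marker: assume the bytes are already raw audio.
  return { wasBase64: false, bytes: buf };
}
```

For example, `decodeUpload(Buffer.from('data:audio/webm;base64,' + b64))` would return the decoded webm bytes, while a recording uploaded from a phone passes through unchanged and still needs transcoding.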
```js
// wxjs:
const app = getApp()
const recorderManager = wx.getRecorderManager()

recorderManager.onStart(() => {
  // Callback fired when recording starts
})

// Callback fired when recording stops
recorderManager.onStop((res) => {
  const { tempFilePath } = res
  // Upload the recorded audio
  wx.uploadFile({
    url: app.d.hostUrl + '/Api/Index/wxupload', // Example only, not a real endpoint
    filePath: tempFilePath,
    name: 'viceo', // Form field name; the back end reads $_FILES['viceo']
    success: function (res) {
      console.log(res)
    }
  })
})

Page({
  // Button pressed: start recording
  startHandel: function () {
    console.log('start')
    recorderManager.start({
      duration: 10000
    })
  },
  // Button released
  endHandle: function () {
    console.log('end')
    // Stop recording, which triggers the onStop callback
    recorderManager.stop()
  }
})
```

```html
<!-- wxml: -->
<view bindtouchstart='startHandel' bindtouchend='endHandle' class="tapview">
  <text>{{text}}</text>
</view>
```
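The recording call above only sets `duration`. As a hedged sketch, `recorderManager.start()` also accepts options such as `sampleRate`, `numberOfChannels`, `encodeBitRate` and `format`, so the mini program could ask for 16 kHz mono audio up front. The helper name `buildRecorderOptions` and the specific values are assumptions; which values are accepted depends on the base library version, so check the WeChat documentation before relying on them.

```javascript
// Hypothetical helper: builds an options object for recorderManager.start().
// The values are illustrative assumptions, not settings from the original app.
function buildRecorderOptions(seconds) {
  return {
    duration: seconds * 1000, // recording length in milliseconds
    sampleRate: 16000,        // 16 kHz, in line with the "auf": "16k" setting sent to the recognizer
    numberOfChannels: 1,      // mono
    encodeBitRate: 48000,     // must fall in the range allowed for the sample rate
    format: 'mp3'             // still needs converting to wav or pcm on the back end
  };
}
```

Inside the page this would be used as `recorderManager.start(buildRecorderOptions(10))`. The back end still has to transcode the result, because the recognizer expects pcm or wav.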
Audio conversion
My back end uses the open-source PHP framework ThinkPHP. Node, Java, Python and other back-end languages work just as well, so choose according to your own preference and skills. For audio transcoding we need an audio/video transcoding tool such as ffmpeg or avconv, both of which depend on gcc. You can search for installation instructions yourself, or follow the article link at the bottom.

```php
<?php
namespace Api\Controller;

use Think\Controller;

class IndexController extends Controller {
    // Receive the uploaded audio and transcode it
    public function wxupload() {
        $upload_res = $_FILES['viceo'];
        $tempfile = file_get_contents($upload_res['tmp_name']);
        // basename() stops a crafted upload name from escaping the target directory
        $name = basename($upload_res['name']);
        $wavname = pathinfo($name, PATHINFO_FILENAME) . ".wav";
        $arr = explode(",", $tempfile);
        $path = 'Aduio/' . $name;
        if (count($arr) > 1 && strpos($tempfile, 'base64') !== false) {
            // Audio recorded in the WeChat simulator can be decoded, stored and returned directly
            file_put_contents($path, base64_decode($arr[1]));
            $data['path'] = $path;
            $this->apiResponse("success", "Transcoding succeeded!", $data);
        } else {
            // Audio recorded on a phone
            $newpath = 'Aduio/' . $wavname;
            file_put_contents($path, $tempfile);
            chmod($path, 0777);
            $root = '/home/wwwroot/mapxcx.kanziqiang.top/';
            // escapeshellarg() keeps the file name from being interpreted by the shell
            $exec1 = "avconv -i " . escapeshellarg($root . $path)
                   . " -vn -f wav " . escapeshellarg($root . $newpath);
            exec($exec1, $info, $status);
            if (!empty($tempfile) && $status == 0) {
                chmod($newpath, 0777);
                $data['path'] = $newpath;
                $this->apiResponse("success", "Transcoding succeeded!", $data);
            }
        }
        $this->apiResponse("error", "An unknown error occurred!");
    }

    // Wrapper for returning JSON data
    function apiResponse($flag = 'error', $message = '', $data = array()) {
        $result = array('flag' => $flag, 'message' => $message, 'data' => $data);
        print json_encode($result);
        exit;
    }
}
```
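Since other back-end languages work too, here is a rough Node.js sketch of the same transcoding step. It passes the arguments as an array through `child_process.execFile`, so the file name never reaches a shell. The helper names `buildTranscodeArgs` and `transcodeToWav` are hypothetical, and the `-ar 16000 -ac 1` flags (resample to 16 kHz mono) are an assumption added to line up with a 16k recognizer setting; the original command does not resample.

```javascript
const { execFile } = require('child_process');
const path = require('path');

// Hypothetical helper: builds the avconv argument list for a WAV conversion.
function buildTranscodeArgs(inputPath, outputDir) {
  const base = path.basename(inputPath, path.extname(inputPath));
  const outputPath = path.join(outputDir, base + '.wav');
  return {
    outputPath,
    // -y overwrite output, -vn drop video, -ar/-ac resample to 16 kHz mono (assumption)
    args: ['-y', '-i', inputPath, '-vn', '-ar', '16000', '-ac', '1', '-f', 'wav', outputPath]
  };
}

// Runs the conversion and resolves with the output path on success.
// `bin` can be switched to 'ffmpeg', which accepts the same flags.
function transcodeToWav(inputPath, outputDir, bin = 'avconv') {
  const { args, outputPath } = buildTranscodeArgs(inputPath, outputDir);
  return new Promise((resolve, reject) => {
    execFile(bin, args, (err) => (err ? reject(err) : resolve(outputPath)));
  });
}
```

Calling `transcodeToWav('Aduio/rec.mp3', 'Aduio')` would write `Aduio/rec.wav`, provided avconv is installed and on the PATH.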
Call the recognition interface
Once the file is ready, we can send the base64-encoded audio to the recognition API. Be careful to follow the specification in the documentation strictly; otherwise the results are unpredictable.

```php
<?php
namespace Api\Controller;

use Think\Controller;

class IndexController extends Controller {
    public function _initialize() {
    }

    // Wrapper for sending the HTTPS request
    public function httpsRequest($url, $data, $xparam) {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        // Certificate verification is disabled for simplicity; enable it in production
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($curl, CURLOPT_HEADER, 0);
        $Appid = "";  // APPID from the open platform
        $Appkey = ""; // ApiKey from the open platform
        $curtime = time();
        $CheckSum = md5($Appkey . $curtime . $xparam . $data);
        $headers = array(
            'X-Appid:' . $Appid,
            'X-CurTime:' . $curtime,
            'X-CheckSum:' . $CheckSum,
            'X-Param:' . $xparam,
            'Content-Type:' . 'application/x-www-form-urlencoded; charset=utf-8'
        );
        curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
        if (!empty($data)) {
            curl_setopt($curl, CURLOPT_POST, 1);
            curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
        }
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        $output = curl_exec($curl);
        curl_close($curl);
        return $output;
    }

    // Build the request and handle the response
    public function getVoice($path) {
        // Encode the contents of the audio file, not the path string
        $d = base64_encode(file_get_contents($path));
        $url = "https://api.xfyun.cn/v1/aiui/v1/voice_semantic";
        $xparam = base64_encode(json_encode(array(
            'scene' => 'main', 'userid' => 'user_0001',
            "auf" => "16k", "aue" => "raw", "spx_fsize" => "60"
        )));
        $data = "data=" . $d;
        // The response is a JSON string, so decode it before reading the code
        $res = json_decode($this->httpsRequest($url, $data, $xparam), true);
        if (is_array($res) && isset($res['code']) && $res['code'] == '00000') {
            $this->apiResponse("success", "Recognition succeeded!", $res);
        } else {
            $this->apiResponse("error", "Recognition failed!");
        }
    }

    // Wrapper for returning JSON data
    function apiResponse($flag = 'error', $message = '', $data = array()) {
        $result = array('flag' => $flag, 'message' => $message, 'data' => $data);
        print json_encode($result);
        exit;
    }
}
```
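The request signing in the PHP code can be easy to get wrong, so here is a small Node.js sketch of just that part. It follows the same recipe as the code above: `X-Param` is the base64 of the JSON parameters, and `X-CheckSum` is the MD5 of the ApiKey, the timestamp, `X-Param` and the request body concatenated in that order. The function name `buildAiuiHeaders` and its argument order are hypothetical; whether this recipe matches the current iFlytek API should be confirmed against their documentation.

```javascript
const crypto = require('crypto');

// Hypothetical helper mirroring the header construction in the PHP code above.
// curTime is a Unix timestamp in seconds; body is the exact string that is POSTed.
function buildAiuiHeaders(appId, apiKey, params, body, curTime) {
  // X-Param: base64 of the JSON-encoded parameters
  const xParam = Buffer.from(JSON.stringify(params)).toString('base64');
  // X-CheckSum: md5(apiKey + curTime + xParam + body), hex encoded
  const checkSum = crypto
    .createHash('md5')
    .update(apiKey + curTime + xParam + body)
    .digest('hex');
  return {
    'X-Appid': appId,
    'X-CurTime': String(curTime),
    'X-CheckSum': checkSum,
    'X-Param': xParam,
    'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8'
  };
}
```

Because the checksum covers the body, the headers have to be computed from exactly the string that is sent; building the body a second time with different encoding would invalidate the signature.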