Implementing speech recognition in a WeChat mini program: lessons learned

小云云
Release: 2018-02-08 16:02:03

I previously wrote a utility WeChat mini program (Find peripheral) that uses speech recognition. This article shares what I learned while implementing speech recognition in a mini program, in the hope that it helps others.

Interface Preview

After working through the iFlytek API documentation and the mini program development documentation, and learning the ThinkPHP back-end framework, I arrived at the following development steps:

  • Register an iFlytek account (the pride of Chinese people, a world leader in speech recognition technology)

  • Go to the AIUI open platform, create an application under application management, and note its APPID and ApiKey

  • Open the application's configuration and set up the scene mode, recognition method, and skills that suit your needs

  • Develop the mini program to record the audio to be recognized (described in detail below)

  • On the back end, transcode the recorded audio (iFlytek supports pcm and wav) and submit it to the recognition interface (described in detail below)

  • The mini program receives the recognition result and continues with its own business logic

Audio recording interface

  • wx.startRecord() and wx.stopRecord()

The wx.startRecord() and wx.stopRecord() interfaces can do the job, but since base library version 1.6.0 they are no longer maintained by the WeChat team, which recommends the more capable wx.getRecorderManager interface instead. The audio obtained by this interface is in silk format. Note that files recorded in the WeChat developer tools simulator are base64-encoded webm rather than real silk; after decoding, the webm still has to be converted to pcm or wav.
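
The decoding step can be sketched in Node as follows. This is a minimal illustration, not the author's code: the helper name `decodeRecording` is hypothetical, and it assumes the simulator upload is text of the form "...base64,<payload>", which is the shape the PHP back end later in this article splits on.

```javascript
// Hypothetical helper: normalize an uploaded recording into raw audio bytes.
// Assumption: simulator uploads are text such as "data:audio/webm;base64,<payload>".
function decodeRecording(uploaded) {
  // Only inspect the first bytes; real device audio is binary and has no such header.
  const head = uploaded.subarray(0, 64).toString('latin1');
  const at = head.indexOf('base64,');
  if (at === -1) {
    return { source: 'device', audio: uploaded };
  }
  // Everything after "base64," is the encoded audio.
  const payload = uploaded.toString('latin1', at + 'base64,'.length);
  return { source: 'simulator', audio: Buffer.from(payload, 'base64') };
}
```

Either way the caller ends up with a Buffer of audio bytes that can be written to disk and handed to the transcoder.
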
  • wx.getRecorderManager()

Compared with wx.startRecord(), this interface is more capable. You can pause and resume recording, and set the sampling rate, number of recording channels, and encoding bit rate as needed. Best of all, you can specify the audio format; the valid values are aac and mp3. The drawback is that wx.getRecorderManager() is only supported from base library 1.6.0 onward, so to support users on older WeChat versions you still need wx.startRecord() as a fallback.
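
One way to handle that fallback is sketched below. The wrapper name `createRecorder` and its return shape are hypothetical; `wxApi` stands for the global `wx` object, passed in as a parameter so the logic can be exercised outside WeChat. The recording options shown are illustrative values, not settings taken from the original project.

```javascript
// Sketch: prefer RecorderManager, fall back to wx.startRecord on old base libraries.
function createRecorder(wxApi, onRecorded) {
  if (typeof wxApi.getRecorderManager === 'function') {
    const manager = wxApi.getRecorderManager();
    // Register the stop listener once; it receives the temporary file path.
    manager.onStop((res) => onRecorded(res.tempFilePath));
    return {
      kind: 'manager',
      start: () => manager.start({
        duration: 10000,       // maximum length in milliseconds
        sampleRate: 16000,     // illustrative values, adjust to your needs
        numberOfChannels: 1,
        encodeBitRate: 48000,
        format: 'mp3',
      }),
      stop: () => manager.stop(),
    };
  }
  // Older clients: the legacy API reports the file in the success callback.
  return {
    kind: 'legacy',
    start: () => wxApi.startRecord({ success: (res) => onRecorded(res.tempFilePath) }),
    stop: () => wxApi.stopRecord(),
  };
}
```

In a page this would be called as `createRecorder(wx, (filePath) => { /* upload filePath */ })`, with the touch handlers calling `start()` and `stop()` on the returned object.
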
  • Event listeners in detail

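// wxjs:

const app = getApp()
const recorderManager = wx.getRecorderManager()

recorderManager.onStart(() => {
  // Callback fired when recording starts
})

// Fired when recording stops
recorderManager.onStop((res) => {
  const { tempFilePath } = res
  // Upload the recorded audio
  wx.uploadFile({
    url: app.d.hostUrl + '/Api/Index/wxupload', // Example only, not a real endpoint
    filePath: tempFilePath,
    name: 'viceo', // Form field name; must match $_FILES['viceo'] on the back end
    success: function (res) {
      console.log(res)
    },
    fail: function (err) {
      console.error('Upload failed', err)
    }
  })
})

Page({
  // Button pressed: start recording
  startHandel: function () {
    console.log('start')
    recorderManager.start({
      duration: 10000 // Maximum recording length in milliseconds
    })
  },
  // Button released
  endHandle: function () {
    console.log('end')
    // Trigger the stop event
    recorderManager.stop()
  }
})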

//wxml:
<view bindtouchstart='startHandel' bindtouchend='endHandle' class="tapview">
    <text>{{text}}</text>
</view>
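
The success callback of wx.uploadFile receives the server response as a string in res.data, so it has to be parsed before the mini program can act on it. The helper below is a hedged sketch of that step: the name `parseUploadResponse` is hypothetical, and the `flag` / `message` / `data.path` fields are assumed to match what the back end's apiResponse() method in this article prints.

```javascript
// Sketch: turn the wx.uploadFile success payload into a small result object.
function parseUploadResponse(res) {
  if (res.statusCode !== 200) {
    return { ok: false, message: 'HTTP ' + res.statusCode };
  }
  let body;
  try {
    body = JSON.parse(res.data); // res.data arrives as a string, not an object
  } catch (err) {
    return { ok: false, message: 'Response was not valid JSON' };
  }
  if (!body || body.flag !== 'success') {
    return { ok: false, message: (body && body.message) || 'Unknown error' };
  }
  return { ok: true, path: body.data && body.data.path, message: body.message };
}
```

Inside `success: function (res) { ... }` the result of `parseUploadResponse(res)` can then decide whether to continue with the next step or show an error.
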

Audio conversion

My back end uses ThinkPHP, an open-source PHP framework. Node, Java, Python, and other back-end languages work just as well, so choose according to your own preference and skills. Audio transcoding requires an audio/video transcoding tool such as ffmpeg or avconv, both of which depend on gcc. You can search online for installation instructions, or follow the article link at the bottom.

<?php
namespace Api\Controller;
use Think\Controller;
class IndexController extends Controller {

    // Receive the uploaded audio and transcode it
    public function wxupload(){
        $upload_res = $_FILES['viceo'];
        $tempfile = file_get_contents($upload_res['tmp_name']);
        // basename() keeps a crafted client file name from escaping the upload directory
        $name = basename($upload_res['name']);
        $dot = strrpos($name, ".");
        $wavname = ($dot === false ? $name : substr($name, 0, $dot)).".wav";
        $path = 'Aduio/'.$name;
        $arr = explode(",", $tempfile);

        if (strpos($tempfile, 'base64') !== false && isset($arr[1])) {
            // Audio recorded in the WeChat simulator is base64 text: decode, store, and return it directly
            file_put_contents($path, base64_decode($arr[1]));
            $data['path'] = $path;
            $this->apiResponse("success", "Transcoding succeeded!", $data);
        } else {
            // Audio recorded on a phone
            $newpath = 'Aduio/'.$wavname;
            file_put_contents($path, $tempfile);
            chmod($path, 0777);
            $root = "/home/wwwroot/mapxcx.kanziqiang.top/";
            // escapeshellarg() stops the file name from being interpreted by the shell
            $exec1 = "avconv -i ".escapeshellarg($root.$path)." -vn -f wav ".escapeshellarg($root.$newpath);
            exec($exec1, $info, $status);
            if (!empty($tempfile) && $status == 0) {
                chmod($newpath, 0777);
                $data['path'] = $newpath;
                $this->apiResponse("success", "Transcoding succeeded!", $data);
            }
        }
        $this->apiResponse("error", "An unknown error occurred!");
    }
    // Wrapper that prints the JSON response and exits
    function apiResponse($flag = 'error', $message = '', $data = array()){
        $result = array('flag'=>$flag,'message'=>$message,'data'=>$data);
        print json_encode($result);exit;
    }
}
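
For readers whose back end is Node rather than PHP, the transcoding call might look roughly like the sketch below. Several things here are assumptions rather than part of the original project: the function names are hypothetical, the `ffmpeg` binary is assumed to be on the PATH (pass `'avconv'` as the third argument to use avconv instead), and the `-ar 16000 -ac 1` flags are added to produce 16 kHz mono audio so that it matches the `auf=16k` parameter sent to iFlytek later.

```javascript
const path = require('path');
const { execFile } = require('child_process');

// Build the argument list separately so it can be inspected without running anything.
function buildTranscodeArgs(inputPath) {
  const parsed = path.parse(inputPath);
  const outputPath = path.join(parsed.dir, parsed.name + '.wav');
  const args = [
    '-y',            // overwrite the output file if it already exists
    '-i', inputPath,
    '-vn',           // drop any video stream
    '-ar', '16000',  // assumption: resample to 16 kHz
    '-ac', '1',      // assumption: downmix to mono
    '-f', 'wav',
    outputPath,
  ];
  return { args, outputPath };
}

// execFile passes arguments directly, without a shell, so file names need no escaping.
function transcodeToWav(inputPath, callback, binary = 'ffmpeg') {
  const { args, outputPath } = buildTranscodeArgs(inputPath);
  execFile(binary, args, (err) => callback(err, err ? null : outputPath));
}
```

A call such as `transcodeToWav('Aduio/rec.mp3', (err, wavPath) => { ... })` hands back the path of the converted file, or an error if the transcoder exits with a non-zero status.
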

Calling the recognition interface

Once the file is ready, we send the base64-encoded audio to the recognition API. Take care to follow the specification in the iFlytek documentation exactly; otherwise the results are unpredictable.

<?php
namespace Api\Controller;
use Think\Controller;
class IndexController extends Controller {
    public function _initialize(){
    }
    // Wrapper for the HTTPS request
    public function httpsRequest($url, $data, $xparam){
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        // Certificate checks are disabled here for convenience; enable them in production
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($curl, CURLOPT_HEADER, 0);
        $Appid = "";  // APPID from the open platform
        $Appkey = ""; // ApiKey from the open platform
        $curtime = time();
        // Checksum over the key, timestamp, encoded parameters, and request body
        $CheckSum = md5($Appkey.$curtime.$xparam.$data);
        $headers = array(
            'X-Appid:'.$Appid,
            'X-CurTime:'.$curtime,
            'X-CheckSum:'.$CheckSum,
            'X-Param:'.$xparam,
            'Content-Type:'.'application/x-www-form-urlencoded; charset=utf-8'
            );
        curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
        if (!empty($data)){
            curl_setopt($curl, CURLOPT_POST, 1);
            curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
        }
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        $output = curl_exec($curl);
        curl_close($curl);
        return $output;
    }
    // Build the request and handle the response
    public function getVoice($path){
        // Encode the contents of the audio file, not the path string
        $d = base64_encode(file_get_contents($path));
        $url = "https://api.xfyun.cn/v1/aiui/v1/voice_semantic";
        $xparam = base64_encode( json_encode(array('scene' => 'main','userid'=>'user_0001',"auf"=>"16k","aue"=>"raw","spx_fsize"=>"60" )));
        $data = "data=".$d;
        // The interface returns a JSON string; decode it before reading the code field
        $res = json_decode($this->httpsRequest($url,$data,$xparam), true);
        if(is_array($res) && isset($res['code']) && $res['code'] == '00000'){
            $this->apiResponse("success","Recognition succeeded!",$res);
        }else{
            $this->apiResponse("error","Recognition failed!");
        }
    }
    // Wrapper that prints the JSON response and exits
    function apiResponse($flag = 'error', $message = '',$data = array()){
        $result = array('flag'=>$flag,'message'=>$message,'data'=>$data);
        print json_encode($result);exit;
    }
}
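
The header construction in httpsRequest() can be mirrored in Node as a small pure function, which makes the checksum easy to inspect in isolation. This is only a sketch that follows the PHP code above (MD5 over ApiKey + timestamp + base64 parameters + request body); the function name `buildAiuiHeaders` is hypothetical, and the iFlytek documentation remains the authority on the exact format.

```javascript
const crypto = require('crypto');

// Sketch: build the request headers the same way the PHP httpsRequest() method does.
function buildAiuiHeaders(appId, apiKey, param, body, curTime = Math.floor(Date.now() / 1000)) {
  // X-Param is the JSON parameter object, base64-encoded.
  const xParam = Buffer.from(JSON.stringify(param)).toString('base64');
  // The checksum covers the key, the timestamp, the encoded parameters, and the body.
  const checkSum = crypto
    .createHash('md5')
    .update(apiKey + curTime + xParam + body)
    .digest('hex');
  return {
    'X-Appid': appId,
    'X-CurTime': String(curTime),
    'X-CheckSum': checkSum,
    'X-Param': xParam,
    'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8',
  };
}
```

Because the body is part of the checksum, the string passed as `body` has to be byte-for-byte the same string that is later sent as the POST payload.
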
That is basically it. The code above has been tidied up for this article and may not match your actual development needs exactly. If you find anything wrong, feel free to get in touch via WeChat (xiaoqiang0672).

To see a real example, scan the QR code in WeChat.

source:php.cn