PHP is a widely used programming language for building web applications, dynamic web pages, command-line scripts, and other software. As artificial intelligence has advanced, voice technology has come into wide use. iFlytek is a company that provides voice technology services, exposing interfaces such as speech recognition and speech synthesis to applications. This article introduces how to use PHP to access iFlytek voice services and implement speech recognition and speech synthesis.
1. Register for iFlytek Open Platform
To access iFlytek voice services, first register an iFlytek Open Platform account at https://www.xfyun.cn/. After registration, create an application in the Open Platform console and obtain its three parameters: AppID, API Key, and API Secret. These parameters are used when calling the iFlytek voice service API.
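One common way to keep the three parameters out of source code is to read them from environment variables. The following is a minimal sketch; the variable names XFYUN_APP_ID, XFYUN_API_KEY, and XFYUN_API_SECRET and the function loadXfyunCredentials are hypothetical and are not defined by iFlytek.

```php
<?php
// Read the three credentials from environment variables.
// The variable names below are hypothetical; pick whatever fits your deployment.
function loadXfyunCredentials(): array
{
    $names = array(
        "appId" => "XFYUN_APP_ID",
        "apiKey" => "XFYUN_API_KEY",
        "apiSecret" => "XFYUN_API_SECRET",
    );
    $credentials = array();
    foreach ($names as $key => $envName) {
        $value = getenv($envName);
        // getenv() returns false when the variable is not set
        if ($value === false || $value === "") {
            throw new RuntimeException("Missing environment variable: " . $envName);
        }
        $credentials[$key] = $value;
    }
    return $credentials;
}
```

A script can then call loadXfyunCredentials() once at startup and fail early if any value is missing.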
2. Speech recognition interface
iFlytek provides several speech recognition interfaces: online speech recognition, offline speech recognition, and customized speech recognition. Before using any of them, you need to record the speech with a microphone or other recording device and convert the recording into an audio format that the iFlytek voice interface accepts.
With the online speech recognition interface, you send the recorded audio file to the iFlytek voice server, which performs the recognition and returns the result. Each request must be authenticated; in the sample code below, this is done through request headers that carry the AppID, a timestamp, and a checksum:
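Before uploading a recording, it can help to check its format locally. The sketch below reads the header of a canonical WAV file and tests it against an assumed target of 16 kHz, 16-bit, mono PCM. That target is an assumption made for illustration (the 16000 rate matches the synthesis sample later in this article), and the function names are hypothetical; confirm the accepted formats in the interface documentation.

```php
<?php
// Parse the RIFF/WAVE header fields of a canonical WAV file.
function readWavFormat(string $bytes): array
{
    if (strlen($bytes) < 36) {
        throw new InvalidArgumentException("Too short to contain a WAV header");
    }
    // V = 32-bit little-endian, v = 16-bit little-endian, a4 = 4 raw bytes
    $h = unpack(
        "a4riff/Vsize/a4wave/a4fmt/VfmtSize/vformat/vchannels/VsampleRate/VbyteRate/vblockAlign/vbits",
        $bytes
    );
    if ($h["riff"] !== "RIFF" || $h["wave"] !== "WAVE" || $h["fmt"] !== "fmt ") {
        throw new InvalidArgumentException("Not a canonical RIFF/WAVE file");
    }
    return array(
        "format" => $h["format"],         // 1 means uncompressed PCM
        "channels" => $h["channels"],
        "sampleRate" => $h["sampleRate"],
        "bits" => $h["bits"],
    );
}

// Assumed target format: 16 kHz, 16-bit, mono PCM.
function isRecognitionReady(array $f): bool
{
    return $f["format"] === 1
        && $f["channels"] === 1
        && $f["sampleRate"] === 16000
        && $f["bits"] === 16;
}
```

If the check fails, the recording can be converted with an external audio tool before it is passed to the recognition request.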
<?php
$url = "https://api.xfyun.cn/v1/service/v1/iat";

// AppID, API Key, and API Secret of the application on the iFlytek Open Platform
$appid = "5*****9";
$apiKey = "4****************4e4e4ebc";
$apiSecret = "6cd**************************5ba";

// Current Unix timestamp in seconds
$ts = time();

// ID number (MD5 hash)
$idCard = md5("123456789012345678");

// Path of the audio file to recognize
$audioFilePath = "/path/audio.pcm";
if (!file_exists($audioFilePath)) {
    echo "File does not exist";
    exit(1);
}

// Read the file as raw bytes, then base64-encode it
$audioFile = file_get_contents($audioFilePath);
$audioData = base64_encode($audioFile);

// Request headers
$header = array(
    "Content-Type: application/x-www-form-urlencoded; charset=utf-8",
    "X-Appid: " . $appid,
    "X-CurTime: " . $ts,
    "X-Param: eyJ0eXBlIjoic3lzdGVtIiwibmFtZSI6ImlhdCJ9",
    "X-CheckSum: " . md5($apiKey . $ts . $idCard . $audioData . $apiSecret),
);

// Request body; http_build_query URL-encodes the "+", "/", and "=" in the base64 data
$data = http_build_query(array("audio" => $audioData, "engine_type" => "cloud"));

// Send the HTTP POST request
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
$output = curl_exec($ch);
$error = curl_error($ch);
curl_close($ch);

// curl_exec returns false when the request itself fails
if ($output === false) {
    echo "Request failed: " . $error;
    exit(1);
}

// Parse the result
$result = json_decode($output, true);
if (is_array($result) && isset($result["data"])) {
    echo $result["data"];
} else {
    echo "Error: " . $output;
}
?>
The sample code reads the recorded audio file as binary data, base64-encodes it, and uses the curl library to send it to the iFlytek voice server as a parameter of an HTTP POST request. It also mixes the ID number (as an MD5 hash) into the checksum that is passed in the X-CheckSum request header.
With the offline speech recognition interface, the recorded audio file is matched against the offline recognition model provided by iFlytek, and the recognition result is returned. Using this interface requires downloading the offline recognition model to your local machine and creating a grammar file in advance.
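The header construction can be factored into a small helper. The sketch below follows the sample code's checksum formula exactly; it is not taken from iFlytek's documentation, and the function name buildRequestHeaders is hypothetical. It also shows where the X-Param value comes from: decoding the sample's string with base64 yields the JSON object {"type":"system","name":"iat"}, so the helper builds the header from an array instead of a hard-coded string.

```php
<?php
// Build the request headers the same way the sample request does.
// $params is JSON-encoded and then base64-encoded into X-Param.
// $payload is the base64 audio string that the sample mixes into the checksum.
function buildRequestHeaders(
    string $appId,
    string $apiKey,
    string $apiSecret,
    int $ts,
    string $idHash,
    array $params,
    string $payload
): array {
    $xParam = base64_encode(json_encode($params));
    $checksum = md5($apiKey . $ts . $idHash . $payload . $apiSecret);
    return array(
        "Content-Type: application/x-www-form-urlencoded; charset=utf-8",
        "X-Appid: " . $appId,
        "X-CurTime: " . $ts,
        "X-Param: " . $xParam,
        "X-CheckSum: " . $checksum,
    );
}

// Example call with placeholder credentials and a fixed timestamp
$headers = buildRequestHeaders(
    "app",
    "key",
    "secret",
    1700000000,
    md5("123456789012345678"),
    array("type" => "system", "name" => "iat"),
    base64_encode("audio-bytes")
);
```

The returned array can be passed directly to CURLOPT_HTTPHEADER.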
Download offline recognition model
Go to the iFlytek official website to download the offline recognition model for the language you need: find the voice dictation module in the Open Platform console, open the module settings page, copy the model file download link, download the file to your local computer, and unzip it.
Create a grammar file
Offline speech recognition requires a grammar file. The grammar file can be in JSGF (Java Speech Grammar Format) or BNF (Backus-Naur Form) format. The following is a simple JSGF grammar file example:
#JSGF V1.0;
grammar sample;
public <command> = 开灯 | 关灯 | 调亮度 | 调色温 | 播放音乐 | 暂停音乐 | 下一曲 | 上一曲 | 音量调大 | 音量调小;
In this example, the grammar file defines a single rule named command whose alternatives are: turn on the light, turn off the light, adjust brightness, adjust color temperature, play music, pause music, next song, previous song, volume up, and volume down. For a recorded audio file, the system matches the speech against the commands defined in the grammar file, which is how offline speech recognition is achieved.
The customized speech recognition interface lets users train models on their own data sets and then use those models for recognition through the interface provided by iFlytek. Before using it, you need to upload the data set and train the model. After training is complete, you can call the interface for speech recognition. The calling method is similar to that of the online speech recognition interface and is not repeated here.
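To make the matching idea concrete, the sketch below extracts the alternatives of a single public rule from JSGF text and checks a piece of text against them. It is only an illustration of grammar-constrained matching on text: the real offline engine matches audio against the grammar, and the function names parseCommands and matchCommand are hypothetical.

```php
<?php
// Extract the alternatives of one rule such as "public <command> = a | b | c;".
function parseCommands(string $jsgf): array
{
    if (!preg_match('/public\s+<\w+>\s*=\s*([^;]+);/u', $jsgf, $m)) {
        return array();
    }
    return array_map('trim', explode('|', $m[1]));
}

// Return the command when the text is one of the alternatives, otherwise null.
function matchCommand(string $text, array $commands): ?string
{
    $text = trim($text);
    return in_array($text, $commands, true) ? $text : null;
}

$grammar = "#JSGF V1.0; grammar sample; public <command> = 开灯 | 关灯 | 播放音乐;";
$commands = parseCommands($grammar);
```

Anything outside the listed alternatives is rejected, which is the property that makes a small command grammar suitable for fixed voice commands.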
3. Speech synthesis interface
The speech synthesis interface converts the specified text into a speech audio file and returns the URL of that file. As with speech recognition, each request must be authenticated; the sample code below does this through request headers that carry the AppID, a timestamp, and a checksum:
<?php
$url = "https://api.xfyun.cn/v1/service/v1/tts";

// AppID, API Key, and API Secret of the application on the iFlytek Open Platform
$appid = "5*****9";
$apiKey = "4****************4e4e4ebc";
$apiSecret = "6cd**************************5ba";

// Text to synthesize
$text = "讯飞语音,智能语音,畅想未来";

// Current Unix timestamp in seconds
$ts = time();

// ID number (MD5 hash)
$idCard = md5("123456789012345678");

// Request headers
$header = array(
    "Content-Type: application/x-www-form-urlencoded; charset=utf-8",
    "X-Appid: " . $appid,
    "X-CurTime: " . $ts,
    "X-Param: eyJlbmdpbmVfdHlwZSI6IndlYiIsImRlc2NyaXB0aW9uIjoiMTAwLicipOyAgVGhpcyBtZXRob2Qgd29ya3MgY2FuIGhlYXBzaG90ICogZnJvbSB1c2VyICsgJyMxMjM0NTY3ODkwMTIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjMn IiwiaXNzdWVkQ29kZSI6IjQ2MzkzIn0=",
    "X-CheckSum: " . md5($apiKey . $ts . $idCard . $text . $apiSecret),
);

// Request body; http_build_query URL-encodes the text and the "/", ";", and "=" in auf
$data = http_build_query(array(
    "text" => $text,
    "auf" => "audio/L16;rate=16000",
    "voice_name" => "xiaoyan",
    "engine_type" => "intp65",
    "speed" => 50,
    "volume" => 50,
    "pitch" => 50,
));

// Send the HTTP POST request
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
$output = curl_exec($ch);
$error = curl_error($ch);
curl_close($ch);

// curl_exec returns false when the request itself fails
if ($output === false) {
    echo "Request failed: " . $error;
    exit(1);
}

// Parse the result
$result = json_decode($output, true);
if (is_array($result) && isset($result["data"]["url"])) {
    echo $result["data"]["url"];
} else {
    echo "Error: " . $output;
}
?>
This sample code uses the curl library to send an HTTP POST request whose body carries the text to be synthesized. The same request also sets the audio sampling rate, voice (timbre), speaking speed, volume, and pitch. The response contains the URL of the synthesized audio file.
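The response handling can be isolated into a small function so that a malformed or error response is reported in one place. The sketch below assumes the {"data": {"url": ...}} shape that the sample code reads; the function name extractAudioUrl is hypothetical.

```php
<?php
// Return the audio URL from a synthesis response body.
// Assumes the {"data": {"url": "..."}} shape read by the sample code.
function extractAudioUrl(string $responseBody): string
{
    $result = json_decode($responseBody, true);
    if (
        !is_array($result)
        || !isset($result["data"]["url"])
        || !is_string($result["data"]["url"])
    ) {
        // Include the raw body so the caller can see the server's error message
        throw new RuntimeException("Unexpected response: " . $responseBody);
    }
    return $result["data"]["url"];
}
```

The caller passes the string returned by curl_exec and catches RuntimeException to handle failures.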
4. Summary
This article introduced how to use PHP to access iFlytek voice services, covering the online speech recognition interface, the offline speech recognition interface, and the speech synthesis interface. Developers can choose the interface that fits their needs to add voice support to their applications.