


C++ calls Microsoft's own speech recognition interface to get started quickly
Quick Start with C++ Speech Recognition Interface (Microsoft Speech SDK)
I recently used Microsoft’s C++ speech recognition interface in my graduation project and searched a lot I also encountered many problems with the materials and took many detours. Now I write down my own experience, one is to improve myself, and the other is to repay the society. I hope that after reading this blog, you will learn how to implement the C++ speech recognition interface in 5 minutes. (The platform used is win8+VS2013)
1. Install SDK
Install MicrosoftSpeechPlatformSDK.msi, just install it to the default path.
Download path:
download.csdn.net/detail/michaelliang12/9510691
2. Create a new project and configure the environment
Settings:
1, Properties – Configuration properties –C/C++–General–Additional include directory: C:\Program Files\Microsoft SDKs\Speech\v11.0\Include (the specific path is related to the installation path)
2, Properties – Configuration Properties – Linker – Input – Additional dependencies: sapi.lib;
3. Speech recognition code
The speech recognition interface can be divided into text-to-speech and speech-to-text
1. Text-to-speech
Header files that need to be added:
#include <sapi.h> //导入语音头文件#pragma comment(lib,"sapi.lib") //导入语音头文件库
Function:
void CBodyBasics::MSSSpeak(LPCTSTR speakContent)// speakContent为LPCTSTR型的字符串,调用此函数即可将文字转为语音{ ISpVoice *pVoice = NULL; //初始化COM接口 if (FAILED(::CoInitialize(NULL))) MessageBox(NULL, (LPCWSTR)L"COM接口初始化失败!", (LPCWSTR)L"提示", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2); //获取SpVoice接口 HRESULT hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void**)&pVoice); if (SUCCEEDED(hr)) { pVoice->SetVolume((USHORT)100); //设置音量,范围是 0 -100 pVoice->SetRate(2); //设置速度,范围是 -10 - 10 hr = pVoice->Speak(speakContent, 0, NULL); pVoice->Release(); pVoice = NULL; } //释放com资源 ::CoUninitialize(); }
2. Voice to text
This is a little more troublesome because it requires real-time monitoring of the microphone, which involves to the messaging mechanism of Windows.
(1) First set the project properties:
Properties – Configuration Properties – C/C++ – Preprocessor – Preprocessor Definition: _WIN32_DCOM;
(2) Header files that need to be added:
#include <sapi.h> //导入语音头文件#pragma comment(lib,"sapi.lib") //导入语音头文件库#include//语音识别头文件#include //要用到CString#pragma onceconst int WM_RECORD = WM_USER + 100;//定义消息
(3) Define variables in the .h header file of the program
//定义变量CComPtr<ISpRecognizer>m_cpRecoEngine;// 语音识别引擎(recognition)的接口。CComPtr<ISpRecoContext>m_cpRecoCtxt;// 识别引擎上下文(context)的接口。CComPtr<ISpRecoGrammar>m_cpCmdGrammar;// 识别文法(grammar)的接口。CComPtr<ISpStream>m_cpInputStream;// 流()的接口。CComPtr<ISpObjectToken>m_cpToken;// 语音特征的(token)接口。CComPtr<ISpAudio>m_cpAudio;// 音频(Audio)的接口。(用来保存原来默认的输入流)ULONGLONG ullGrammerID;
(4) Create a speech recognition initialization function (called when the program just starts executing, for example, in the example code at the end of the article, This initialization function is placed in the response code of the dialog box initialization message WM_INITDIALOG)
//语音识别初始化函数void CBodyBasics::MSSListen() { //初始化COM接口 if (FAILED(::CoInitialize(NULL))) MessageBox(NULL, (LPCWSTR)L"COM接口初始化失败!", (LPCWSTR)L"提示", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2); HRESULT hr = m_cpRecoEngine.CoCreateInstance(CLSID_SpSharedRecognizer);//创建Share型识别引擎 if (SUCCEEDED(hr)) { hr = m_cpRecoEngine->CreateRecoContext(&m_cpRecoCtxt);//创建识别上下文接口 hr = m_cpRecoCtxt->SetNotifyWindowMessage(m_hWnd, WM_RECORD, 0, 0);//设置识别消息 const ULONGLONG ullInterest = SPFEI(SPEI_SOUND_START) | SPFEI(SPEI_SOUND_END) | SPFEI(SPEI_RECOGNITION);//设置我们感兴趣的事件 hr = m_cpRecoCtxt->SetInterest(ullInterest, ullInterest); hr = SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &m_cpAudio); m_cpRecoEngine->SetInput(m_cpAudio, true); //创建语法规则 //dictation听说式 //hr = m_cpRecoCtxt->CreateGrammar(GIDDICTATION, &m_cpDictationGrammar); //if (SUCCEEDED(hr)) //{ // hr = m_cpDictationGrammar->LoadDictation(NULL, SPLO_STATIC);//加载词典 //} //C&C命令式,此时语法文件使用xml格式 ullGrammerID = 1000; hr = m_cpRecoCtxt->CreateGrammar(ullGrammerID, &m_cpCmdGrammar); WCHAR wszXMLFile[20] = L"";//加载语法 MultiByteToWideChar(CP_ACP, 0, (LPCSTR)"CmdCtrl.xml", -1, wszXMLFile, 256);//ANSI转UNINCODE hr = m_cpCmdGrammar->LoadCmdFromFile(wszXMLFile, SPLO_DYNAMIC); //MessageBox(NULL, (LPCWSTR)L"语音识别已启动!", (LPCWSTR)L"提示", MB_CANCELTRYCONTINUE ); //激活语法进行识别 //hr = m_cpDictationGrammar->SetDictationState(SPRS_ACTIVE);//dictation hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);//C&C hr = m_cpRecoEngine->SetRecoState(SPRST_ACTIVE); } else { MessageBox(NULL, (LPCWSTR)L"语音识别引擎启动出错!", (LPCWSTR)L"警告", MB_OK); exit(0); } //释放com资源 ::CoUninitialize(); //hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_INACTIVE);//C&C}
(5) Define the message processing function
needs to be placed together with other message processing codes, such as in the code of this article, placed The end of the DlgProc() function of the sample code at the end of the article. The entire other code blocks in this article can be copied directly. You only need to change the following Message response module
//消息处理函数USES_CONVERSION; CSpEvent event; if (m_cpRecoCtxt) { while (event.GetFrom(m_cpRecoCtxt) == S_OK){ switch (event.eEventId) { case SPEI_RECOGNITION: { //识别出了语音 m_bGotReco = TRUE; static const WCHAR wszUnrecognized[] = L"<Unrecognized>"; CSpDynamicString dstrText; ////取得识别结果 if (FAILED(event.RecoResult()->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE, TRUE, &dstrText, NULL))) { dstrText = wszUnrecognized; } BSTR SRout; dstrText.CopyToBSTR(&SRout); CString Recstring; Recstring.Empty(); Recstring = SRout; //做出反应(*****消息反应模块*****) if (Recstring == "发短信") { //MessageBox(NULL, (LPCWSTR)L"好的", (LPCWSTR)L"提示", MB_OK); MSSSpeak(LPCTSTR(_T("好,马上发短信!"))); } else if (Recstring == "李雷") { MSSSpeak(LPCTSTR(_T("好久没看见他了,真是 long time no see"))); } } break; } } }
(6) Modify the syntax file
Modify the CmdCtrl.xml file, you can Improve the recognition of certain words, and the recognition effect of the words inside will be much better, such as people's names, etc. (In addition, when running the exe alone, you also need to put this file in the same folder as the exe. If you do not put it, no error will be reported, but the vocabulary recognition effect in the grammar file will become worse)
<?xml version="1.0" encoding="utf-8"?><GRAMMAR LANGID="804"> <DEFINE> <ID NAME="VID_SubName1" VAL="4001"/> <ID NAME="VID_SubName2" VAL="4002"/> <ID NAME="VID_SubName3" VAL="4003"/> <ID NAME="VID_SubName4" VAL="4004"/> <ID NAME="VID_SubName5" VAL="4005"/> <ID NAME="VID_SubName6" VAL="4006"/> <ID NAME="VID_SubName7" VAL="4007"/> <ID NAME="VID_SubName8" VAL="4008"/> <ID NAME="VID_SubName9" VAL="4009"/> <ID NAME="VID_SubNameRule" VAL="3001"/> <ID NAME="VID_TopLevelRule" VAL="3000"/> </DEFINE> <RULE ID="VID_TopLevelRule" TOPLEVEL="ACTIVE"> <O> <L> <P>我要</P> <P>运行</P> <P>执行</P> </L> </O> <RULEREF REFID="VID_SubNameRule" /> </RULE> <RULE ID="VID_SubNameRule" > <L PROPID="VID_SubNameRule"> <P VAL="VID_SubName1">发短信</P> <P VAL="VID_SubName2">是的</P> <P VAL="VID_SubName3">好的</P> <P VAL="VID_SubName4">不用</P> <P VAL="VID_SubName5">李雷</P> <P VAL="VID_SubName6">韩梅梅</P> <P VAL="VID_SubName7">中文界面</P> <P VAL="VID_SubName8">英文界面</P> <P VAL="VID_SubName9">English</P> </L> </RULE></GRAMMAR>
The above is the detailed content of C++ calls Microsoft's own speech recognition interface to get started quickly. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



In C, the char type is used in strings: 1. Store a single character; 2. Use an array to represent a string and end with a null terminator; 3. Operate through a string operation function; 4. Read or output a string from the keyboard.

The calculation of C35 is essentially combinatorial mathematics, representing the number of combinations selected from 3 of 5 elements. The calculation formula is C53 = 5! / (3! * 2!), which can be directly calculated by loops to improve efficiency and avoid overflow. In addition, understanding the nature of combinations and mastering efficient calculation methods is crucial to solving many problems in the fields of probability statistics, cryptography, algorithm design, etc.

Multithreading in the language can greatly improve program efficiency. There are four main ways to implement multithreading in C language: Create independent processes: Create multiple independently running processes, each process has its own memory space. Pseudo-multithreading: Create multiple execution streams in a process that share the same memory space and execute alternately. Multi-threaded library: Use multi-threaded libraries such as pthreads to create and manage threads, providing rich thread operation functions. Coroutine: A lightweight multi-threaded implementation that divides tasks into small subtasks and executes them in turn.

The main difference between an abstract class and an interface is that an abstract class can contain the implementation of a method, while an interface can only define the signature of a method. 1. Abstract class is defined using abstract keyword, which can contain abstract and concrete methods, suitable for providing default implementations and shared code. 2. The interface is defined using the interface keyword, which only contains method signatures, which is suitable for defining behavioral norms and multiple inheritance.

std::unique removes adjacent duplicate elements in the container and moves them to the end, returning an iterator pointing to the first duplicate element. std::distance calculates the distance between two iterators, that is, the number of elements they point to. These two functions are useful for optimizing code and improving efficiency, but there are also some pitfalls to be paid attention to, such as: std::unique only deals with adjacent duplicate elements. std::distance is less efficient when dealing with non-random access iterators. By mastering these features and best practices, you can fully utilize the power of these two functions.

In C language, snake nomenclature is a coding style convention, which uses underscores to connect multiple words to form variable names or function names to enhance readability. Although it won't affect compilation and operation, lengthy naming, IDE support issues, and historical baggage need to be considered.

The release_semaphore function in C is used to release the obtained semaphore so that other threads or processes can access shared resources. It increases the semaphore count by 1, allowing the blocking thread to continue execution.

Dev-C 4.9.9.2 Compilation Errors and Solutions When compiling programs in Windows 11 system using Dev-C 4.9.9.2, the compiler record pane may display the following error message: gcc.exe:internalerror:aborted(programcollect2)pleasesubmitafullbugreport.seeforinstructions. Although the final "compilation is successful", the actual program cannot run and an error message "original code archive cannot be compiled" pops up. This is usually because the linker collects
