How to convert text to speech in golang

PHPz
Release: 2023-04-24 15:05:48

With the rapid development of artificial intelligence, voice technology has become part of people's daily lives. Many scenarios call for converting text into speech quickly, such as reading aloud in education, automatic voice replies in intelligent customer service, and voice prompts in car navigation. golang offers a concise way to build this. This article shows how to use golang to convert text to speech, first with a local speech engine and then with a cloud API.

  1. Install golang third-party package

In golang, we can implement the text-to-speech function with the help of third-party libraries. This article mentions two, go-tts and go-astits, and uses the one it calls go-astits. Note that the install command and the import path used below point to github.com/mkb218/gosynth/v2, so that is the module the example code actually depends on.

Run the following command to install the package:

go get github.com/mkb218/gosynth/v2
  2. Install the speech engine

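The example depends on a speech engine installed on the system, which performs the actual synthesis. Two widely used open-source engines are espeak and festival. Here, we choose espeak, which the code below invokes as a command-line program.

To install espeak on Debian or Ubuntu: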

sudo apt-get install espeak
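
Before running the program, it can help to confirm that the engine is actually reachable from Go. The following is a minimal sketch using only the standard library; the helper name findEngine is hypothetical, and including espeak-ng as a fallback candidate is an assumption about what may be installed on your system.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findEngine returns the path of the first candidate binary found on PATH.
// The name findEngine is illustrative, not part of any library.
func findEngine(candidates ...string) (string, error) {
	for _, name := range candidates {
		if path, err := exec.LookPath(name); err == nil {
			return path, nil
		}
	}
	return "", fmt.Errorf("no speech engine found on PATH (tried %v)", candidates)
}

func main() {
	// espeak-ng is listed only as a fallback; adjust to your system.
	path, err := findEngine("espeak", "espeak-ng")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("using speech engine at", path)
}
```

If the lookup fails, the installation step above most likely did not complete, or the binary is installed under a different name.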

After the installation is complete, copy the following code into your editor and save it as a .go file:

package main

import (
    "fmt"
    "os/exec"
    "time"

    "github.com/mkb218/gosynth/v2/synth"
)

func main() {
    // Call the espeak command to convert the text into an audio file
    err := exec.Command("espeak", "-w", "test.wav", "Hello, World!").Run()
    if err != nil {
        fmt.Println("Failed to convert text to wave file:", err)
        return
    }

    // Play the audio file, then wait for the duration of the playback
    player := synth.NewWAVFilePlayer("test.wav")
    player.Play()
    time.Sleep(player.Duration())
}

Before running the above code, make sure the folder in which test.wav will be written already exists. The code calls the espeak command through the Command function of the os/exec package to convert the text into an audio file. It then plays test.wav with the NewWAVFilePlayer function from the synth package imported at the top of the file.
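
The synthesis step on its own needs nothing beyond the standard library. The sketch below covers only that step: it creates the output folder if it is missing, passes a voice and speaking rate to espeak, and reports espeak's own output when the command fails. The helper names espeakArgs and synthesize, the audio folder, the "en" voice and the rate of 150 words per minute are all illustrative choices; -v, -s and -w are espeak's voice, speed and WAV-output options.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strconv"
)

// espeakArgs builds the argument list for one espeak invocation.
// -v selects the voice, -s the speed in words per minute, -w the output WAV file.
func espeakArgs(text, outPath, voice string, wpm int) []string {
	return []string{"-v", voice, "-s", strconv.Itoa(wpm), "-w", outPath, text}
}

// synthesize creates the output folder if needed, then runs espeak.
func synthesize(text, outPath string) error {
	if err := os.MkdirAll(filepath.Dir(outPath), 0o755); err != nil {
		return fmt.Errorf("create output folder: %w", err)
	}
	out, err := exec.Command("espeak", espeakArgs(text, outPath, "en", 150)...).CombinedOutput()
	if err != nil {
		// Include whatever espeak printed to make the failure easier to diagnose.
		return fmt.Errorf("espeak failed: %w (output: %s)", err, out)
	}
	return nil
}

func main() {
	outPath := filepath.Join("audio", "test.wav")
	if err := synthesize("Hello, World!", outPath); err != nil {
		fmt.Println("Failed to convert text to wave file:", err)
		return
	}
	fmt.Println("wrote", outPath)
}
```

Keeping the argument construction in its own function makes it easy to check the command line without running espeak at all.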

  3. Call third-party API

In addition to installing a speech engine locally, we can also implement text-to-speech by calling a third-party speech API. Commonly used speech APIs include those provided by cloud service providers such as Alibaba Cloud and Tencent Cloud.

Here, we use the Baidu speech synthesis API. To use it, go to https://ai.baidu.com/tech/speech/tts, register, create an application, and obtain the App ID, API Key and Secret Key needed to access the API.
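
Once the keys have been issued, one option is to keep them out of the source file and read them from the environment. This is a minimal sketch of that idea, not something the Baidu API requires; the variable names BAIDU_API_KEY and BAIDU_SECRET_KEY and the helper loadCredentials are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// credentials holds the two values needed for the token request.
type credentials struct {
	APIKey    string
	SecretKey string
}

// loadCredentials reads both keys through getenv, so it can be exercised
// without touching the real environment. The variable names are assumptions.
func loadCredentials(getenv func(string) string) (credentials, error) {
	c := credentials{
		APIKey:    getenv("BAIDU_API_KEY"),
		SecretKey: getenv("BAIDU_SECRET_KEY"),
	}
	if c.APIKey == "" || c.SecretKey == "" {
		return credentials{}, errors.New("BAIDU_API_KEY and BAIDU_SECRET_KEY must both be set")
	}
	return c, nil
}

func main() {
	creds, err := loadCredentials(os.Getenv)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Avoid printing the secrets themselves.
	fmt.Println("loaded API key of length", len(creds.APIKey))
}
```

The returned values can then be passed as client_id and client_secret in the token request.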

Install the resty HTTP client used to call the API:

go get github.com/go-resty/resty/v2

Write code to interact with the Baidu speech synthesis API:

package main

import (
    "encoding/json"
    "fmt"

    "github.com/go-resty/resty/v2"
)

// tokenResponse holds the only field this example needs from the token response.
type tokenResponse struct {
    AccessToken string `json:"access_token"`
}

func main() {
    // Get the access token
    client := resty.New()
    resp, err := client.R().
        SetFormData(map[string]string{
            "grant_type":    "client_credentials",
            "client_id":     "YOUR_API_KEY",
            "client_secret": "YOUR_SECRET_KEY",
        }).
        Post("https://aip.baidubce.com/oauth/2.0/token")
    if err != nil {
        fmt.Println("Failed to get token: ", err)
        return
    }
    var token tokenResponse
    if err = json.Unmarshal(resp.Body(), &token); err != nil {
        fmt.Println("Failed to unmarshal token response: ", err)
        return
    }

    // Call the speech synthesis API to convert the text to speech.
    // Parameters are sent as form fields; check the names and value
    // ranges against the current Baidu documentation.
    resp, err = client.R().
        SetFormData(map[string]string{
            "tex":  "你好,欢迎使用百度语音合成API", // text to synthesize
            "tok":  token.AccessToken,          // access token obtained above
            "cuid": "YOUR_CUID",                // unique ID for this client
            "ctp":  "1",                        // client type
            "lan":  "zh",                       // language
            "spd":  "5",                        // speed
            "vol":  "15",                       // volume
            "per":  "4",                        // voice
            "aue":  "3",                        // audio format
        }).
        Post("https://tsn.baidu.com/text2audio")
    if err != nil {
        fmt.Println("Failed to request API: ", err)
        return
    }

    fmt.Println(resp.StatusCode())
}

Note that in the above code, you need to replace the API Key and Secret Key placeholders with the credentials you obtained from Baidu Cloud. By calling the Baidu speech synthesis API, we can implement speech synthesis without installing a local speech engine.
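
The example above stops after printing the HTTP status code. To keep the synthesized audio, the response body still has to be written to a file, and an error reply has to be told apart from audio data. The sketch below assumes that a successful reply carries an audio/* Content-Type and that a failure is a JSON body with err_no and err_msg fields; both points should be checked against the Baidu documentation. The helper name saveAudio and the file name output.mp3 are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// apiError mirrors the JSON error body; the field names are an assumption.
type apiError struct {
	ErrNo  int    `json:"err_no"`
	ErrMsg string `json:"err_msg"`
}

// saveAudio writes body to path only when contentType marks it as audio.
// Otherwise it tries to decode the body as an error message.
func saveAudio(contentType string, body []byte, path string) error {
	if strings.HasPrefix(contentType, "audio/") {
		return os.WriteFile(path, body, 0o644)
	}
	var e apiError
	if err := json.Unmarshal(body, &e); err != nil {
		return fmt.Errorf("unexpected %q response: %s", contentType, body)
	}
	return fmt.Errorf("synthesis failed: err_no=%d err_msg=%s", e.ErrNo, e.ErrMsg)
}

func main() {
	// Simulated error reply with made-up values, so the sketch runs offline.
	err := saveAudio("application/json", []byte(`{"err_no":1,"err_msg":"example"}`), "output.mp3")
	fmt.Println(err)
}
```

With resty, the call would look like saveAudio(resp.Header().Get("Content-Type"), resp.Body(), "output.mp3").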

  4. Summary

By using a local speech engine or calling a third-party API, we can implement text-to-speech in golang. This article introduced the basic steps of both approaches: generating audio locally with espeak and playing it back, and calling the Baidu speech synthesis API. The local approach works offline and needs no account, while the API approach avoids installing a speech engine but requires network access and credentials.

source:php.cn