Integrating Tencent Cloud Text-to-Speech Using Python

Tech May 12 11

Tencent Cloud offers a Text-to-Speech (TTS) service that converts written text into spoken audio. This capability enables applications to provide voice output for scenarios such as news reading in mobile apps, alerts from smart devices, custom voice content created from minimal source material, and personalized navigation instructions in automotive systems.

Before using the Python SDK, you must obtain security credentials (SecretID and SecretKey) from the Tencent Cloud Console. The SecretID authenticates the API caller, while the SecretKey is used for signing requests and must be kept confidential.
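Because the SecretKey must stay confidential, it is usually safer to read both values from the environment than to paste them into a script. The following is a minimal sketch of that idea, not part of the SDK: the helper names load_tencent_credentials and mask_secret are hypothetical, and the variable names TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY are an assumption you can change to match your deployment.

```python
import os


def load_tencent_credentials(environ=None):
    """Return (secret_id, secret_key) read from environment variables.

    The variable names are an assumption made for this sketch; use
    whatever names your deployment defines.
    """
    env = os.environ if environ is None else environ
    secret_id = env.get("TENCENTCLOUD_SECRET_ID", "").strip()
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY", "").strip()
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY before running."
        )
    return secret_id, secret_key


def mask_secret(value, visible=4):
    """Hide all but the last few characters so a secret can be logged safely."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

The returned pair can then be passed to credential.Credential(secret_id, secret_key) in place of literal strings, and mask_secret can be used if a credential ever needs to appear in a log line.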

You can install the SDK using pip. Execute the following command in your terminal:

pip install tencentcloud-sdk-python

In environments where both Python 2 and Python 3 are installed, use pip3 (or python3 -m pip) so the package is installed for Python 3.
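To confirm that the package landed in the interpreter you intend to use, you can check whether the tencentcloud module is discoverable. This is a small sketch using only the standard library; module_available is a hypothetical helper name.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# "tencentcloud" is the top-level package imported by the script in this article.
if module_available("tencentcloud"):
    print("tencentcloud SDK found")
else:
    print("tencentcloud SDK missing; run: pip install tencentcloud-sdk-python")
```

Run it with the same python executable that will run your TTS script, since each interpreter has its own set of installed packages.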

The following script demonstrates basic usage of the TTS API to generate an audio file.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from base64 import b64decode
from uuid import uuid4
from tencentcloud.common import credential
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.aai.v20180522.models import TextToVoiceRequest
from tencentcloud.aai.v20180522.aai_client import AaiClient

try:
    # Initialize credential object with your SecretID and SecretKey
    auth = credential.Credential("YOUR_SECRET_ID", "YOUR_SECRET_KEY")
    
    # Create a client instance for the TTS service, specifying the region
    tts_client = AaiClient(auth, 'ap-shanghai')
    
    # Instantiate the request object
    tts_request = TextToVoiceRequest()
    
    # Configure the request parameters
    tts_request.Text = 'The quick brown fox jumps over the lazy dog.'
    tts_request.SessionId = str(uuid4())
    tts_request.ModelType = 1
    tts_request.Volume = 5.0
    tts_request.Speed = 0.6
    tts_request.ProjectId = 10086
    tts_request.VoiceType = 0
    tts_request.PrimaryLanguage = 1
    tts_request.SampleRate = 16000
    
    # Send the request and receive the response
    api_response = tts_client.TextToVoice(tts_request)
    
    # The response contains the audio data in base64 encoding and metadata
    # Example structure:
    # {
    #   "Audio": "UklGRl...",
    #   "RequestId": "unique-request-id",
    #   "SessionId": "session-identifier"
    # }
    
    # Decode the base64 audio data to binary
    audio_data = b64decode(api_response.Audio)
    
    # Write the binary data to a WAV file
    output_filename = 'synthesized_speech.wav'
    with open(output_filename, 'wb') as audio_file:
        audio_file.write(audio_data)
        
except TencentCloudSDKException as error:
    print(f"API call failed: {error}")

This example outlines the core process: setting up authentication, configuring the synthesis request with parameters such as text, voice type, speed, and volume, executing the call, and saving the returned audio to a file.
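Before writing the decoded bytes to disk, it can be useful to check that they really are WAV data. The sketch below assumes the service returns a standard RIFF/WAVE container, which the "UklGR" prefix in the sample response suggests, since that is the base64 form of "RIFF". The helper names decode_wav and make_silent_wav_b64 are hypothetical; the second only builds a stand-in payload so the first can be tried without calling the API.

```python
import base64
import io
import wave


def decode_wav(b64_audio):
    """Decode base64 audio and return (raw_bytes, duration_seconds).

    Raises ValueError if the payload is not a RIFF/WAVE container.
    """
    raw = base64.b64decode(b64_audio)
    if raw[:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("payload is not WAV data")
    # Read the duration from the WAV header fields.
    with wave.open(io.BytesIO(raw), "rb") as wav:
        duration = wav.getnframes() / float(wav.getframerate())
    return raw, duration


def make_silent_wav_b64(seconds=1, sample_rate=16000):
    """Build a base64 WAV of silence, standing in for an API response."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)
    return base64.b64encode(buffer.getvalue()).decode("ascii")
```

In the script, decode_wav(api_response.Audio) would take the place of the plain b64decode call, and the returned bytes can be written to the output file exactly as before.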

Tags: Python Text-to-Speech Tencent Cloud

Copyright © fadingcoder.top

Fading Coder

Integrating Tencent Cloud Text-to-Speech Using Python

Related Articles

Understanding Strong and Weak References in Java

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Leave a CommentCancel Reply

Copyright © fadingcoder.top

Leave a Comment