Practical Guide to Azure Custom Neural Voice: Essential Tips for Success

by info.odysseyx@gmail.com, August 26, 2024

Teaser image created by DALL-E 3

Custom Neural Voice (CNV) is a feature of Azure Cognitive Services that lets you create personalized synthetic voices for your applications. This text-to-speech feature uses human speech samples as training data to produce a natural-sounding voice for your brand or character.

Recently, while working on a project involving custom voice generation, I ran into some features and hidden pitfalls that are not covered in the official documentation, so I would like to share some tips and tricks in this article. The theoretical aspects are well documented, so the advice here is based mainly on my personal experience. I hope you find these insights useful. Let’s get started!

Audio recording

First, you need to prepare a balanced script. It is more important to have a good mix of questions, exclamations, and statements than to ensure that the training set closely matches the target domain. In short, a good dataset should include:

- Statements: 70-80%
- Questions: 10-20%, with an equal number of rising and falling tones (yes/no questions use rising tones, while wh-questions very commonly use falling tones)
- Exclamations: 10-20%
- Short words/phrases: 10%

Sound editing software

There are several options, such as Adobe Audition or Audacity. I recommend Audacity, not only because Adobe Audition is paid, but also because Audacity’s limited feature set is ideal for our needs: we only need to select an utterance, cut it, and export it. Minimalism is the key to success. Audacity also makes it easy to navigate the track and keeps unnecessary toolbars out of the way. The File menu in Audacity provides commands to create, open, and save projects, and to import and export audio files.
For example, the export command has no keyboard shortcut assigned by default, but you can easily create one for exporting a selection. This speeds up the process considerably. Having used both tools, I completed the same amount of work in two days with Audacity, compared to four days with Adobe Audition.

Price

Here are my project details:

- Model type: Neural V5.2022.05
- Engine version: 2023.01.16.0
- Training time: 30.48 hours
- Data size: 440 utterances
- Price: $1584.27

That works out to roughly $52 per training hour. Pricing may vary depending on the engine version and the number of training hours, but these figures should at least give you a reference point.

Intake form

As you probably know, access is granted only after you complete the Intake Form, and decisions are made based on eligibility and usage criteria. Before providing any project information, review Microsoft’s Responsible AI Standards. This will allow you to tailor your description and scenario accordingly.

Prepare your audio

The process is very simple. Create a text file (for example, in Notepad) that lists all the utterances and their IDs. Select the utterances one by one, export each one, save it under its ID, and then delete it from the list. Choose a comfortable zoom level in advance and do not zoom in or out while working. You will become familiar with the timeline scale, which makes it easier to add the 100-200 milliseconds of silence you need.
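If you want to double-check the exported clips afterwards, the 100-200 millisecond silence padding can be measured with a short script. The following is only a minimal sketch using Python's standard `wave` module. It assumes 16-bit PCM WAV files and assumes the padding is wanted at both the start and the end of each clip. The function names and the amplitude threshold of 500 are illustrative choices, not part of any Azure or Audacity tooling, and the threshold in particular would need tuning to your recording's noise floor.

```python
import array
import sys
import wave


def edge_silence_ms(path, threshold=500):
    """Return (leading_ms, trailing_ms) of near-silence in a 16-bit PCM WAV file.

    `threshold` is an assumed amplitude cutoff: samples whose absolute
    value is at or below it are treated as silence.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    total_frames = len(samples) // channels
    # Positions of samples louder than the threshold, on any channel.
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        whole = 1000.0 * total_frames / rate
        return whole, whole  # the entire clip is silent

    first_frame = loud[0] // channels
    last_frame = loud[-1] // channels
    leading = 1000.0 * first_frame / rate
    trailing = 1000.0 * (total_frames - 1 - last_frame) / rate
    return leading, trailing


def has_valid_padding(path, low_ms=100, high_ms=200):
    """True if both edges carry between low_ms and high_ms of silence."""
    leading, trailing = edge_silence_ms(path)
    return low_ms <= leading <= high_ms and low_ms <= trailing <= high_ms
```

Running `has_valid_padding` over every exported file gives a quick list of clips that need to be re-cut before upload.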
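The script mix recommended in the Audio recording section (statements 70-80%, questions 10-20%, exclamations 10-20%, short words/phrases 10%) can be checked before you book recording time. The sketch below is a rough heuristic, not an official tool. It classifies utterances by their final punctuation, treats questions that start with a wh-word as falling-tone questions and all other questions as rising-tone ones, and counts anything of three words or fewer as a short phrase. The word list, the three-word cutoff, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Assumed list of question words that usually signal a falling tone.
WH_WORDS = {"what", "when", "where", "who", "whom", "whose", "why", "which", "how"}
CATEGORIES = ("statement", "yes_no_question", "wh_question", "exclamation", "short")


def classify(utterance, short_max_words=3):
    """Assign one utterance to a script category using simple text heuristics."""
    text = utterance.strip()
    words = text.split()
    if text.endswith("?"):
        first = words[0].lower().strip("\"'")
        return "wh_question" if first in WH_WORDS else "yes_no_question"
    if text.endswith("!"):
        return "exclamation"
    if len(words) <= short_max_words:
        return "short"
    return "statement"


def script_mix(utterances):
    """Return the percentage of each category across the non-empty utterances."""
    counts = Counter(classify(u) for u in utterances if u.strip())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no utterances to analyze")
    return {kind: 100.0 * counts[kind] / total for kind in CATEGORIES}
```

Comparing the returned percentages with the target ranges, and checking that the two question categories are roughly equal, shows at a glance which kind of sentence the script still lacks. Punctuation is only a proxy for intonation, so the final judgment still belongs to whoever reviews the script.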