ElevenLabs Releases AI-Powered Voice Design API and X to Voice Features


ElevenLabs, a New York-based artificial intelligence (AI) firm, has released an application programming interface (API) for its recently launched Voice Design feature. The announcement came last week, and alongside it the company introduced an open-source project called X to Voice, which can generate a unique AI voice for an X profile. The feature also shows the text prompt that is automatically generated from the analysis of the profile.

In a blog post, ElevenLabs detailed the two new AI tools. The first is the API version of the recently introduced Voice Design tool. Voice Design is a capability developed by the company that generates unique AI voices from text prompts. The voices are shaped by details the user provides, including pitch, timbre, delivery speed, intonation, and more.

This feature is now available through the company’s API, which means developers can build it into their own apps and software. Developers can use Voice Design to create voices for their AI characters, or offer it to their users so that they can generate new voices for themselves.

The company offers two endpoints. The first lets developers generate three unique voice previews from a text prompt. The second lets them save a voice preview to their library for local use. ElevenLabs did not disclose pricing for the API or the cost per request, and details about the underlying AI model are also not known.
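To make the two-endpoint flow concrete, here is a minimal Python sketch of how a developer might build the two requests. The article does not give URLs or field names, so the base URL, the paths, the `xi-api-key` header and the JSON fields (`voice_description`, `text`, `generated_voice_id`, `voice_name`) are assumptions that should be checked against the current ElevenLabs API reference. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed values, not taken from the article: verify against the API reference.
BASE_URL = "https://api.elevenlabs.io"
PREVIEWS_PATH = "/v1/text-to-voice/create-previews"
SAVE_PATH = "/v1/text-to-voice/create-voice-from-preview"


def build_request(path, api_key, payload):
    """Build (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def preview_request(api_key, voice_description, sample_text):
    """First endpoint: ask for voice previews generated from a text prompt."""
    payload = {"voice_description": voice_description, "text": sample_text}
    return build_request(PREVIEWS_PATH, api_key, payload)


def save_request(api_key, generated_voice_id, voice_name, voice_description):
    """Second endpoint: save one chosen preview to the voice library."""
    payload = {
        "generated_voice_id": generated_voice_id,  # assumed to come from a preview
        "voice_name": voice_name,
        "voice_description": voice_description,
    }
    return build_request(SAVE_PATH, api_key, payload)


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Nothing is sent until `send()` is called with a real API key, so the request builders can be inspected or logged first. The response format is not described in the article, so any code that reads previews from the result should follow the official reference.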

The second tool is the company’s open-source project, X to Voice. It extends a feature that is available for testing on the web client. Users enter an X username, and the AI automatically analyzes the profile, including the bio and posts. It then generates a text prompt based on that analysis.

The text prompt is automatically fed into Voice Design to generate a unique voice for the profile. Gadgets 360 tested the feature and found that it took 30 seconds to a minute to generate a voice preview for a profile. In total, three voice previews are generated. The AI voice speaks a line that is also based on the analysis of the profile.

Along with the three voice previews, the page displays the text prompts used to generate the AI voice. We also found that the feature animates the profile photos of users who have uploaded close-ups of their faces, syncing lip and mouth movements to the spoken words.
