OpenAI launches the DALL-E 3 API alongside new text-to-speech models

OpenAI introduced a range of new APIs at its first developer day. Among them, DALL-E 3, OpenAI’s text-to-image model, is now available through an API, after first appearing in ChatGPT and Bing Chat. Like its predecessor, the DALL-E 3 API includes built-in moderation designed to prevent misuse.

The DALL-E 3 API also offers several format and quality options, with pricing starting at $0.04 per generated image.
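To put the starting price in perspective, here is a minimal sketch of a budget estimate. It assumes every image is billed at the $0.04 starting rate quoted above; other format or quality options may cost more, so treat the result as a lower bound. The function name `estimate_image_cost` is hypothetical and is not part of any OpenAI library.

```python
from decimal import Decimal

# Starting price quoted in the article (USD per generated image).
# Assumption: every image is billed at this rate; other options may cost more.
IMAGE_START_PRICE_USD = Decimal("0.04")


def estimate_image_cost(num_images: int) -> Decimal:
    """Return a lower-bound cost in USD for generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return IMAGE_START_PRICE_USD * num_images


# Example: a batch of 250 images at the starting rate.
print(f"250 images cost at least ${estimate_image_cost(250)}")
```

Decimal arithmetic is used here so that small per-unit prices add up without floating-point rounding drift.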

OpenAI also introduced a text-to-speech API that offers six preset voices and two model variants. The API is available now, with pricing starting at $0.015 per 1,000 input characters.
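The per-character pricing can be turned into a rough estimate for a given script. The sketch below assumes the $0.015 starting rate applies, that cost scales linearly with length, and that a "character" corresponds to one element counted by Python's `len()`; actual billing rules may differ. The function name `estimate_tts_cost` is hypothetical.

```python
from decimal import Decimal

# Starting price quoted in the article (USD per 1,000 input characters).
# Assumption: linear, prorated billing at this rate.
TTS_START_PRICE_PER_1K_USD = Decimal("0.015")


def estimate_tts_cost(text: str) -> Decimal:
    """Return an estimated cost in USD for converting text to speech."""
    # Assumption: characters are counted the way len() counts them.
    return TTS_START_PRICE_PER_1K_USD * len(text) / 1000


# Example: a 2,500-character script.
script = "a" * 2500
print(f"Estimated cost: ${estimate_tts_cost(script)}")
```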

During the announcement, OpenAI CEO Sam Altman emphasized how natural the technology sounds, stating, “This is much more natural than anything else we’ve heard out there, which can make apps more natural to interact with and more accessible. Furthermore, this technology opens the door to a wide range of applications, including language learning and voice assistance.”

Separately, OpenAI released the latest version of its open-source automatic speech recognition model, Whisper large-v3. The company says this version performs better across multiple languages.

Pooja Prajapati

