AI heavyweight OpenAI recently wrapped up the second edition of its developer conference, OpenAI DevDay. While OpenAI DevDay 2023 featured many products, including GPT-4 Turbo, the Assistants API, and custom GPTs, this year’s conference was more low-key and did not include any major new product releases.
The occasion did, however, highlight a few minor improvements and the company’s plans for the future.
OpenAI showcased four innovations
At the event, OpenAI presented four innovations: Prompt Caching, the Realtime API, Model Distillation, and Vision Fine-Tuning. These technologies are intended to help developers build engaging applications and to keep OpenAI competitive in the developer ecosystem.
With OpenAI DevDay 2024, the Sam Altman-led organization wants to empower developers. The company’s strategy has also clearly shifted at a time when major technology companies are competing more aggressively on AI services.
Realtime API
OpenAI has introduced the Realtime API in public beta, allowing paid developers to build low-latency, multimodal experiences into their apps. The API supports natural speech-to-speech conversations using six preset voices, similar to ChatGPT’s Advanced Voice Mode. OpenAI also plans to add audio input and output to the Chat Completions API for use cases that don’t require low latency.
Developers can pass any text or audio inputs into GPT-4o and have the model respond using text, audio, or both. This allows users to have engaging natural conversations with apps. OpenAI claims that one can build natural conversational experiences with a single API call.
The Realtime API is currently available in public beta to all paid developers, and the audio capabilities in the Chat Completions API will follow in the coming weeks through a new model called gpt-4o-audio-preview. The Realtime API uses both text and audio tokens, with text input tokens priced at $5 per 1M and text output tokens at $20 per 1M.
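For a sense of the programming model, the sketch below shows what a minimal Realtime API session might look like from Python over a raw WebSocket. The endpoint URL, model name, beta header, and event shapes are assumptions based on the beta as announced, so treat this as an illustration rather than a definitive integration.

```python
import asyncio
import json
import os

import websockets  # third-party client: pip install websockets


async def main():
    # Assumed beta endpoint and model name for the Realtime API.
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # Note: newer releases of the websockets package name this argument
    # additional_headers instead of extra_headers.
    async with websockets.connect(url, extra_headers=headers) as ws:
        # Ask the model for a response that includes both text and audio.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["text", "audio"],
                "instructions": "Greet the user and ask how you can help.",
            },
        }))
        # Server events stream back over the same connection; stop once the
        # response is reported as finished.
        async for message in ws:
            event = json.loads(message)
            print(event.get("type"))
            if event.get("type") == "response.done":
                break


asyncio.run(main())
```

In a real voice application, microphone audio would be streamed up and the returned audio chunks played back as they arrive, which is what makes a single persistent connection lower-latency than chaining separate transcription, completion, and text-to-speech calls.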
Vision fine-tuning
OpenAI has introduced vision fine-tuning for its GPT-4o large language model, allowing developers to customize the AI model’s comprehension of images and text. This feature can be used in areas like autonomous vehicles, visual search, and medical imaging.
OpenAI says the process mirrors fine-tuning with text: developers prepare image datasets in the required format and upload them to the platform. The feature can improve GPT-4o’s performance on vision tasks with as few as 100 images, and performance rises further with larger volumes of text and image data.
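As a rough illustration of that workflow, the sketch below prepares a single chat-style training example containing an image, uploads the JSONL file, and starts a fine-tuning job with the official openai Python package. The dataset schema, the placeholder image URL, and the gpt-4o-2024-08-06 snapshot name are assumptions to check against OpenAI’s fine-tuning guide.

```python
import json

from openai import OpenAI  # official SDK: pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One chat-style training example; the image is referenced as an image_url
# content part. The URL and labels here are placeholders.
example = {
    "messages": [
        {"role": "system", "content": "You identify road features in street images."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "How many lanes are visible?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/street.jpg"}},
            ],
        },
        {"role": "assistant", "content": "Two lanes."},
    ]
}

# Fine-tuning datasets are uploaded as JSONL, one example per line.
with open("vision_train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")

training_file = client.files.create(
    file=open("vision_train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job against a GPT-4o snapshot (assumed model name).
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)
print(job.id, job.status)
```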
Grab, a Southeast Asian food delivery and rideshare company, has used the technology to improve its mapping services, achieving a 20% improvement in lane counts and a 13% boost in speed limit sign localisation.
Prompt caching
OpenAI has introduced Prompt Caching, a new feature aimed at reducing costs and latency for developers. The feature lets developers reuse recently processed input tokens, which are billed at a 50% discount and handled with faster prompt processing times. Prompt Caching applies to the latest versions of GPT-4o, GPT-4o mini, o1-preview, and o1-mini, as well as to fine-tuned versions of these models.
Cached prompts will be offered at a discount compared to uncached prompts. OpenAI has shared detailed pricing for the feature on its official website. The feature is subject to OpenAI’s Enterprise privacy commitments and enables developers to scale their applications in production while balancing performance, cost, and latency.
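In practice, caching rewards prompts that share a long, stable prefix across requests. The sketch below illustrates one way to structure calls so the static instructions come first and only the user question varies; the roughly 1,024-token minimum prefix and the cached_tokens usage field are assumptions to verify against OpenAI’s pricing and API documentation.

```python
from openai import OpenAI  # official SDK: pip install openai

client = OpenAI()

# Keep the long, unchanging part of the prompt (instructions, reference text)
# at the front so repeated calls share the same prefix. Caching is assumed to
# kick in only once the shared prefix is long enough (on the order of 1,024 tokens).
LONG_SYSTEM_PROMPT = "You are a support assistant for Acme Corp. ..."  # imagine ~2,000 tokens here


def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": LONG_SYSTEM_PROMPT},  # identical prefix on every call
            {"role": "user", "content": question},              # only this part changes
        ],
    )
    usage = response.usage
    # On a cache hit, part of the input is reported (and billed) as cached tokens.
    cached = getattr(usage.prompt_tokens_details, "cached_tokens", 0)
    print(f"prompt tokens: {usage.prompt_tokens}, cached: {cached}")
    return response.choices[0].message.content


ask("How do I reset my password?")
ask("What is your refund policy?")  # a second call shortly after can reuse the cached prefix
```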
Model distillation
OpenAI has introduced Model Distillation, a new feature that lets developers manage the entire distillation pipeline from within the OpenAI platform. Developers can use the outputs of frontier models like o1-preview and GPT-4o to fine-tune and improve the performance of more cost-efficient models like GPT-4o mini.
This could benefit smaller organizations by letting them tap the capabilities of advanced models without the high computational costs of running them. Previously, model distillation was a complex and error-prone process that required multiple operations across disconnected tools.
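A minimal sketch of what the capture step of such a pipeline might look like in Python is shown below; the store and metadata parameters for retaining completions as a distillation dataset are assumptions to confirm against OpenAI’s documentation, and the follow-up fine-tuning of the smaller model is described only in comments.

```python
from openai import OpenAI  # official SDK: pip install openai

client = OpenAI()

# Step 1: capture real traffic from the frontier ("teacher") model. The store
# and metadata arguments ask the platform to retain the completion so it can
# later be turned into training data (parameter names are assumptions).
teacher = client.chat.completions.create(
    model="gpt-4o",
    store=True,
    metadata={"purpose": "distillation-demo"},
    messages=[
        {"role": "user", "content": "Summarise this ticket: printer offline after firmware update."}
    ],
)
print(teacher.choices[0].message.content)

# Step 2: from the platform dashboard (or via the API), filter the stored
# completions, export them as a fine-tuning dataset, and train the smaller
# "student" model, e.g. a fine-tuning job targeting GPT-4o mini. Evaluate the
# student against the teacher on held-out prompts before routing traffic to it.
```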
The new feature simplifies this process by letting developers create high-quality datasets from real-world examples.

OpenAI’s announcements at DevDay signal a strategic shift towards making its products more cost-effective, supporting the developer ecosystem, and focusing on model efficiency. The AI powerhouse also aims to reduce resource intensity and environmental impact.