OpenAI Introduces GPT-4o, a Combined Text-Audio-Vision Chatbot Model

6 months ago
According to OpenAI, GPT-4o (the "o" stands for "omni") accepts any combination of text, audio, and image as input and generates any combination of text, audio, and image as output. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.
