OpenAI Real-Time API

The OpenAI Realtime API enables the creation of low-latency, multimodal applications capable of real-time interactions. This API supports both text and audio inputs and outputs, allowing for seamless speech-to-speech conversations and real-time transcription services.

Under the hood, the Realtime API establishes a persistent WebSocket connection, facilitating continuous, bidirectional communication between the application and OpenAI's GPT-4o model. This setup enables applications to handle interruptions gracefully and maintain natural conversational flows, similar to human interactions.

Ioto supports the Realtime API through its built-in WebSocket support, giving the application a direct, high-performance connection to the LLM.

API Tour

The following example uses the openaiRealTimeConnect API to establish a WebSocket connection to the OpenAI Realtime API endpoint.

The example loops, calling a getUserQuestion function to prompt the user for a question. The text is then sent to the LLM via the webSocketSend API. For clarity, the example omits error handling.

c
static void aiChatRealTime(Web *web)
{
    Url   *up;
    cchar *question;
    char  buf[1024];

    if ((up = openaiRealTimeConnect(NULL)) == NULL) {
        webError(web, 400, "Cannot connect to OpenAI");
        return;
    }
    //  Arrange for onEvent to be invoked for incoming messages and connection events
    urlAsync(up, (WebSocketProc) onEvent, up);

    while ((question = getUserQuestion()) != NULL) {
        //  Send the user's input question
        webSocketSend(up->webSocket, SFMT(buf, SDEF({
            type: 'conversation.item.create',
            item: {
                type: 'message',
                role: 'user',
                content: [{
                    type: 'input_text',
                    text: '%s',
                }],
            },
        }), question));

        //  Ask for the model to respond
        webSocketSend(up->webSocket, SDEF({
            type: 'response.create',
            response: {
                modalities: [ 'text' ],
                instructions: 'Please assist the user with their question.',
            },
        }));
        //  In case getUserQuestion does not yield, yield here so events can run
        rSleep(0);
    }
    urlFree(up);
}
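
The example above substitutes the question text directly into the message template. Assuming that substitution is plain printf-style formatting, a question containing quotes, backslashes, or newlines would produce malformed JSON. The following is a minimal sketch of escaping the text first. The helpers jsonEscapeText and buildUserMessage are hypothetical, are not part of Ioto, and use only the standard C library.

```c
#include <stdio.h>
#include <string.h>

/*
    Hypothetical helper (not an Ioto API): escape text for use inside a JSON string literal.
    Returns the number of bytes written, excluding the null, or -1 if dest is too small.
 */
static int jsonEscapeText(const char *src, char *dest, size_t size)
{
    size_t used = 0;

    if (size == 0) {
        return -1;
    }
    for (; *src; src++) {
        unsigned char c = (unsigned char) *src;
        char          seq[8];
        int           n;

        if (c == '"' || c == '\\') {
            n = snprintf(seq, sizeof(seq), "\\%c", c);
        } else if (c == '\n') {
            n = snprintf(seq, sizeof(seq), "\\n");
        } else if (c == '\r') {
            n = snprintf(seq, sizeof(seq), "\\r");
        } else if (c == '\t') {
            n = snprintf(seq, sizeof(seq), "\\t");
        } else if (c < 0x20) {
            //  Other control characters use the \u00XX form
            n = snprintf(seq, sizeof(seq), "\\u%04x", (unsigned) c);
        } else {
            seq[0] = (char) c;
            seq[1] = '\0';
            n = 1;
        }
        //  Always leave room for the terminating null
        if (used + (size_t) n >= size) {
            return -1;
        }
        memcpy(dest + used, seq, (size_t) n);
        used += (size_t) n;
    }
    dest[used] = '\0';
    return (int) used;
}

/*
    Hypothetical helper: build a conversation.item.create event holding the escaped question.
    Returns the message length or -1 if either buffer is too small.
 */
static int buildUserMessage(const char *question, char *dest, size_t size)
{
    char escaped[512];
    int  n;

    if (jsonEscapeText(question, escaped, sizeof(escaped)) < 0) {
        return -1;
    }
    n = snprintf(dest, size,
        "{\"type\":\"conversation.item.create\",\"item\":{\"type\":\"message\","
        "\"role\":\"user\",\"content\":[{\"type\":\"input_text\",\"text\":\"%s\"}]}}",
        escaped);
    if (n < 0 || (size_t) n >= size) {
        return -1;
    }
    return n;
}
```

The resulting buffer could then be passed as the message text when sending, in place of formatting the raw question into the template.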

The onEvent callback is invoked for connection events, errors, and incoming messages. The Realtime API streams output progressively via response.text.delta messages, which can be displayed to the user incrementally as they arrive.

See the Realtime API for details.

c
/*
    Callback for the OpenAI Real Time API.
    This is called when a message is received from OpenAI.
 */
static void onEvent(WebSocket *ws, int event, cchar *message, ssize len, void *arg)
{
    Json    *json;
    cchar   *delta, *type;

    if (event == WS_EVENT_MESSAGE) {
        json = jsonParse(message);
        type = jsonGet(json, 0, "type", 0);
        if (smatch(type, "response.text.delta")) {
            delta = jsonGet(json, 0, "delta", "");
            //  Display the "delta" text to the user
        }
        jsonFree(json);

    } else if (event == WS_EVENT_CLOSE) {
        //  Set a flag to stop processing
    }
}
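
When the full reply is needed as well as the incremental display, the delta fragments can be collected as they arrive. The following is a minimal sketch using only the standard C library. The Reply type and the replyInit and replyOnEvent helpers are hypothetical and not part of Ioto. The sketch assumes the type and delta fields have already been extracted from the message, and that a response.text.done event marks the end of the text; check the OpenAI event reference for the event names used by your API version.

```c
#include <stddef.h>
#include <string.h>

//  Hypothetical accumulator for one streamed text response
typedef struct Reply {
    char   text[4096];      //  Accumulated text, always null terminated
    size_t len;             //  Length of the text accumulated so far
    int    done;            //  Set once the end-of-text event is seen
} Reply;

static void replyInit(Reply *reply)
{
    reply->text[0] = '\0';
    reply->len = 0;
    reply->done = 0;
}

/*
    Apply one event to the accumulator. The type and delta arguments are the
    already-parsed "type" and "delta" fields of the incoming message.
    Returns 0 on success or -1 if the fragment does not fit in the buffer.
 */
static int replyOnEvent(Reply *reply, const char *type, const char *delta)
{
    if (type == NULL) {
        return 0;
    }
    if (strcmp(type, "response.text.delta") == 0 && delta != NULL) {
        size_t n = strlen(delta);
        if (reply->len + n >= sizeof(reply->text)) {
            return -1;
        }
        memcpy(reply->text + reply->len, delta, n);
        reply->len += n;
        reply->text[reply->len] = '\0';

    } else if (strcmp(type, "response.text.done") == 0) {
        //  Assumed end-of-text marker
        reply->done = 1;
    }
    //  Other event types are ignored
    return 0;
}
```

In the callback above, such a helper would be called from the WS_EVENT_MESSAGE branch with the parsed type and delta values, with the Reply passed through the callback argument.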

See the ai app in the Ioto Agent source download for an example real-time.html web page that uses the Realtime API. This sample proxies the browser's WebSocket connection to Ioto through to the OpenAI Realtime API WebSocket connection.

References

Consult the OpenAI documentation for API details: